PREFACE


Advanced Algebra and its companion volume Basic Algebra systematically develop
concepts and tools in algebra that are vital to every mathematician, whether
pure or applied, aspiring or established. The two books together aim to give the 
reader a global view of algebra, its use, and its role in mathematics as a whole. 
The idea is to explain what the young mathematician needs to know about algebra 
in order to communicate well with colleagues in all branches of mathematics. 

The books are written as textbooks, and their primary audience is students 
who are learning the material for the first time and who are planning a career in 
which they will use advanced mathematics professionally. Much of the material 
in the two books, including nearly all of Basic Algebra and some of Advanced 
Algebra, corresponds to normal course work, with the proportions depending on 
the university. The books include further topics that may be skipped in required 
courses but that the professional mathematician will ultimately want to learn by 
self-study. The test of each topic for inclusion is whether it is something that a 
plenary lecturer at a broad international or national meeting is likely to take as 
known by the audience. 


Key topics and features of Advanced Algebra are as follows: 


• Topics build on the linear algebra, group theory, factorization of ideals, structure
of fields, Galois theory, and elementary theory of modules developed in
Basic Algebra.

• Individual chapters treat various topics in commutative and noncommutative
algebra, together providing introductions to the theory of associative algebras,
homological algebra, algebraic number theory, and algebraic geometry.

• The text emphasizes connections between algebra and other branches of mathematics,
particularly topology and complex analysis. All the while, it carries
along two themes from Basic Algebra: the analogy between integers and
polynomials in one variable over a field, and the relationship between number
theory and geometry.

• Several sections in two chapters introduce the subject of Gröbner bases, which
is the modern gateway toward handling simultaneous polynomial equations in
applications.

• The development proceeds from the particular to the general, often introducing
examples well before a theory that incorporates them.

• More than 250 problems at the ends of chapters illuminate aspects of the text,
develop related topics, and point to additional applications. A separate section
“Hints for Solutions of Problems” at the end of the book gives detailed hints
for most of the problems, complete solutions for many.


It is assumed that the reader is already familiar with linear algebra, group 
theory, rings and modules, unique factorization domains, Dedekind domains, 
fields and algebraic extension fields, and Galois theory at the level discussed in 
Basic Algebra. Not all of this material is needed for each chapter of Advanced 
Algebra, and chapter-by-chapter information about prerequisites appears in the 
Guide for the Reader beginning on page xvii. 

Historically the subjects of algebraic number theory and algebraic geometry 
have influenced each other as they have developed, and the present book tries to 
bring out this interaction to some extent. It is easy to see that there must be a close 
connection. In fact, one number-theory problem already solved by Fermat and 
Euler was to find all pairs (x, y) of integers satisfying $x^2 + y^2 = n$, where n is a
given positive integer. More generally one can consider higher-order equations
of this kind, such as $y^2 = x^3 + 8x$. Even this simple change of degree has a great
effect on the difficulty, so much so that one is inclined first to solve an easier 
problem: find the rational pairs satisfying the equation. Is the search for rational 
solutions a problem in number theory or a problem about a curve in the plane? The 
answer is that really it is both. We can carry this kind of question further. Instead 
of considering solutions of a single polynomial equation in two variables, we 
can consider solutions of a system of polynomial equations in several variables. 
Within the system no individual equation is an intrinsic feature of the problem 
because one of the equations can always be replaced by its sum with another of 
the equations; if we regard each equation as an expression set equal to 0, then 
the intrinsic problem is to study the locus of common zeros of the equations. 
This formulation of the problem sounds much more like algebraic geometry than 
number theory. 
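
To make the first problem in this paragraph concrete, here is a minimal brute-force sketch in Python that lists every integer pair (x, y) with x^2 + y^2 = n for a given positive integer n. The function name `sum_of_two_squares` is a hypothetical helper introduced only for illustration; it is not taken from the book, and it is not the method of Fermat or Euler.

```python
from math import isqrt


def sum_of_two_squares(n: int) -> list[tuple[int, int]]:
    """Return all integer pairs (x, y) with x*x + y*y == n, by exhaustive search."""
    if n < 0:
        return []
    pairs = []
    bound = isqrt(n)  # any solution has |x| <= floor(sqrt(n))
    for x in range(-bound, bound + 1):
        rest = n - x * x          # candidate value of y*y
        y = isqrt(rest)
        if y * y == rest:         # rest is a perfect square
            pairs.append((x, y))
            if y != 0:
                pairs.append((x, -y))
    return sorted(pairs)
```

The search is finite because any solution satisfies |x| <= sqrt(n). No comparably simple bound is available for the rational solutions of a cubic such as y^2 = x^3 + 8x, which is one way to see why the change of degree matters.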

A doubter might draw a distinction between integer solutions and rational 
solutions, saying that finding integer solutions is number theory while finding 
rational solutions is algebraic geometry. Experience shows that this is an artificial 
distinction. Although algebraic geometry was initially developed as a subject that 
studies solutions for which the variables take values in a field, particularly in an 
algebraically closed field, the insistence on working only with fields imposed 
artificial limitations on how problems could be approached. In the late 1950s and 
early 1960s the foundations of the subject were transformed by allowing variables 
to take values in an arbitrary commutative ring with identity. The very end of this 
book aims to give some idea of what those new foundations are. 

Along the way we shall observe parallels between number theory and algebraic 
geometry, even as we nominally study one subject at a time. The book begins with 
a chapter on those aspects of number theory that mark the historical transition
from classical number theory to modern algebraic number theory. Chapter I deals 
with three celebrated advances of Gauss and Dirichlet in classical number theory 
that one might wish to generalize by means of algebraic number theory. The 
detailed level of knowledge that one gains about those topics can be regarded as 
a goal for the desired level of understanding about more complicated problems. 
Chapter I thus establishes a framework for the whole book. 

Associative algebras are the topic of Chapters II and III. The tools for studying 
such algebras provide methods for classifying noncommutative division rings. 
One such tool, known as the Brauer group, has a cohomological interpretation 
that ties the subject to algebraic number theory. 

Because of other work done in the 1950s, homology and cohomology can be 
abstracted in such a way that the theory impacts several fields simultaneously, 
including topology and complex analysis. The resulting subject is called homological
algebra and is the topic of Chapter IV. Having cohomology available at this
point of the present book means that one is prepared to use it both in algebraic 
number theory and in situations in algebraic geometry that have grown out of 
complex analysis. 

The last six chapters are about algebraic number theory, algebraic geometry, 
and the relationship between them. Chapters V-VI concern the three main
foundational theorems in algebraic number theory. Chapter V goes at these 
results in a direct fashion but falls short of giving a complete proof in one case. 
Chapter VI goes at matters more indirectly. It explores the parallel between 
number theory and the theory of algebraic curves, makes use of tools from analysis 
concerning compactness and completeness, succeeds in giving full proofs of the 
three theorems of Chapter V, and introduces the modern approach via adeles and 
ideles to deeper questions in these subject areas. 

Chapters VII-X are about algebraic geometry. Chapter VII fills in some
prerequisites from the theories of fields and commutative rings that are needed to
set up the foundations of algebraic geometry. Chapters VIII-X concern algebraic
geometry itself. They come at the subject successively from three points of
view: from the algebraic point of view of simultaneous systems of polynomial
equations in several variables, from the number-theoretic point of view suggested 
by the classical theory of Riemann surfaces, and from the geometric point of view. 


The topics most likely to be included in normal course work include the 
Wedderburn theory of semisimple algebras in Chapter II, homological algebra 
in Chapter IV, and some of the advanced material on fields in Chapter VII. A
chart on page xvi tells the dependence of chapters on earlier chapters, and, as 
mentioned above, the section Guide for the Reader tells what knowledge of Basic 
Algebra is assumed for each chapter. 

The problems at the ends of chapters are intended to play a more important 
role than is normal for problems in a mathematics book. Almost all problems are
solved in the section of hints at the end of the book. This being so, some blocks of 
problems form additional topics that could have been included in the text but were 
not; these blocks may be regarded as optional topics, or they may be treated as 
challenges for the reader. The optional topics of this kind usually either carry out 
further development of the theory or introduce significant applications to other 
branches of mathematics. For example a number of applications to topology are 
treated in this way. 

Not all problems are of this kind, of course. Some of the problems are 
really pure or applied theorems, some are examples showing the degree to which 
hypotheses can be stretched, and a few are just exercises. The reader gets no 
indication which problems are of which type, nor of which ones are relatively 
easy. Each problem can be solved with tools developed up to that point in the 
book, plus any additional prerequisites that are noted. 

The theorems, propositions, lemmas, and corollaries within each chapter are 
indexed by a single number stream. Figures have their own number stream, and 
one can find the page reference for each figure from the table on page xv. Labels 
on displayed lines occur only within proofs and examples, and they are local to the 
particular proof or example in progress. Each occurrence of the word “PROOF”
is matched by an occurrence at the right margin of the symbol □ to
mark the end of that proof.


I am grateful to Ann Kostant and Steven Krantz for encouraging this project 
and for making many suggestions about pursuing it, and I am indebted to David 
Kramer, who did the copyediting. The typesetting was by AMS-TeX, and the
figures were drawn with Mathematica. 

I invite corrections and other comments from readers. I plan to maintain a list 
of known corrections on my own Web page. 

A. W. KNAPP 
August 2007 
DEPENDENCE AMONG CHAPTERS


Below is a chart of the main lines of dependence of chapters on prior chapters. 
The dashed lines indicate helpful motivation but no logical dependence. Apart 
from that, particular examples may make use of information from earlier chapters 
that is not indicated by the chart. 
GUIDE FOR THE READER


This section is intended to help the reader find out what parts of each chapter are 
most important and how the chapters are interrelated. Further information of this 
kind is contained in the abstracts that begin each of the chapters. 

The book treats its subject material as pointing toward algebraic number 
theory and algebraic geometry, with emphasis on aspects of these subjects that 
impact fields of mathematics other than algebra. Two chapters treat the theory 
of associative algebras, not necessarily commutative, and one chapter treats 
homological algebra; both these topics play a role in algebraic number theory and 
algebraic geometry, and homological algebra plays an important role in topology 
and complex analysis. The constant theme is a relationship between number 
theory and geometry, and this theme recurs throughout the book on different 
levels. 

The book assumes knowledge of most of the content of Basic Algebra, either 
from that book itself or from some comparable source. Some of the less standard 
results that are needed from Basic Algebra are summarized in the section Notation 
and Terminology beginning on page xxi. The assumed knowledge of algebra 
includes facility with using the Axiom of Choice, Zorn’s Lemma, and elementary 
properties of cardinality. All chapters of the present book but the first assume 
knowledge of Chapters I-IV of Basic Algebra other than the Sylow Theorems, 
facts from Chapter V about determinants and characteristic polynomials and 
minimal polynomials, simple properties of multilinear forms from Chapter VI, 
the definitions and elementary properties of ideals and modules from Chapter VIII, 
the Chinese Remainder Theorem and the theory of unique factorization domains 
from Chapter VIII, and the theory of algebraic field extensions and separability 
and Galois groups from Chapter IX. Additional knowledge of parts of Basic 
Algebra that is needed for particular chapters is discussed below. In addition, 
some sections of the book, as indicated below, make use of some real or complex 
analysis. The real analysis in question generally consists in the use of infinite 
series, uniform convergence, differential calculus in several variables, and some 
point-set topology. The complex analysis generally consists in the fundamentals 
of the one-variable theory of analytic functions, including the Cauchy Integral 
Formula, expansions in convergent power series, and analytic continuation. 


The remainder of this section is an overview of individual chapters and groups 
of chapters. 
Chapter I concerns three results of Gauss and Dirichlet that marked a transition
from the classical number theory of Fermat, Euler, and Lagrange to the algebraic 
number theory of Kummer, Dedekind, Kronecker, Hermite, and Eisenstein. These 
results are Gauss’s Law of Quadratic Reciprocity, the theory of binary quadratic 
forms begun by Gauss and continued by Dirichlet, and Dirichlet’s Theorem on 
primes in arithmetic progressions. Quadratic reciprocity was a necessary preliminary
for the theory of binary quadratic forms. When viewed as giving information
about a certain class of Diophantine equations, the theory of binary quadratic 
forms gives a gauge of what to hope for more generally. The theory anticipates 
the definition of abstract abelian groups, which occurred later historically, and 
it anticipates the definition of the class number of an algebraic number field, at 
least in the quadratic case. Dirichlet obtained formulas for the class numbers 
that arise from binary quadratic forms, and these formulas led to the method by 
which he proved his theorem on primes in arithmetic progressions. Much of the 
chapter uses only elementary results from Basic Algebra. However, Sections 6-7 
use facts about quadratic number fields, including the multiplication of ideals 
in their rings of integers, and Section 10 uses the Fourier inversion formula for 
finite abelian groups, which is in Section VII.4 of Basic Algebra. Sections 8-10 
make use of a certain amount of real and complex analysis concerning uniform 
convergence and properties of analytic functions. 


Chapters II-III introduce the theory of associative algebras over fields. Chapter II
includes the original theory of Wedderburn, including an amplification by
E. Artin, while Chapter III introduces the Brauer group and connects the theory 
with the cohomology of groups. The basic material on simple and semisimple 
associative algebras is in Sections 1-3 of Chapter II, which assumes familiarity 
with commutative Noetherian rings as in Chapter VIII of Basic Algebra, plus the 
material in Chapter X on semisimple modules, chain conditions for modules, and 
the Jordan-Hölder Theorem. Sections 4-6 contain the statement and proof of
Wedderburn’s Main Theorem, telling the structure of general finite-dimensional
associative algebras in characteristic 0. These sections include a relatively self-contained
segment from Proposition 2.29 through Proposition 2.33’ on the role
of separability in the structure of tensor products of algebras. This material is the
part of Sections 4-6 that is used in the remainder of the chapter to analyze finite-dimensional
associative division algebras over fields. Two easy consequences of
this analysis are Wedderburn’s Theorem that every finite division ring is commutative
and Frobenius’s Theorem that the only finite-dimensional associative
division algebras over R are R, C, and the algebra H of quaternions, up to R
isomorphism.

Chapter III introduces the Brauer group to parametrize the isomorphism classes 
of finite-dimensional associative division algebras whose center is a given field. 
Sections 2-3 exhibit an isomorphism of a relative Brauer group with what turns
out to be a cohomology group in degree 2. This development runs parallel to
the theory of factor sets for groups as in Chapter VII of Basic Algebra, and 
some familiarity with that theory can be helpful as motivation. The case that the 
relative Brauer group is cyclic is of special importance, and the theory is used in 
the problems to construct examples of division rings that would not have been 
otherwise available. The chapter makes use of material from Chapter X of Basic 
Algebra on the tensor product of algebras and on complexes and exact sequences. 

Chapter IV is about homological algebra, with emphasis on connecting homomorphisms,
long exact sequences, and derived functors. All but the last section is
done in the context of “good” categories of unital left R modules, R being a ring 
with identity, where it is possible to work with individual elements in each object. 
The reader is expected to be familiar with some example for motivation; this can 
be knowledge of cohomology of groups at the level of Section III.5, or it can be 
some experience from topology or from the cohomology of Lie algebras as treated 
in other books. Knowledge of complexes and exact sequences from Chapter X 
of Basic Algebra is prerequisite. Homological algebra properly belongs in this 
book because it is fundamental in topology and complex analysis; in algebra 
its role becomes significant just beyond the level of the current book. Important 
applications are not limited in practice to “good” categories; “sheaf” cohomology 
is an example with significant applications that does not fit this mold. Section 8 
sketches the theory of homological algebra in the context of “abelian” categories. 
In this case one does not have individual elements at hand, but some substitute is 
still possible; sheaf cohomology can be treated in this context. 

Chapters V and VI are an introduction to algebraic number theory. The theory 
of Dedekind domains from Chapters VIII and IX of Basic Algebra is taken as
known, along with knowledge of the ingredients of the theory: Noetherian rings,
integral closure, and localization. Both chapters deal with three theorems: the
Dedekind Discriminant Theorem, the Dirichlet Unit Theorem, and the finiteness 
of the class number. Chapter V attacks these directly, using no additional tools, 
and it comes up a little short in the case of the Dedekind Discriminant Theorem. 
Chapter VI introduces tools to get around the weakness of the development in 
Chapter V. These tools are valuations, completions, and decompositions of tensor 
products of fields with complete fields. Chapter VI makes extensive use of metric 
spaces and completeness, and compactness plays an important role in Sections 
9-10. As noted in remarks with Proposition 6.7, Section VI.2 takes for granted
that Theorem 8.54 of Basic Algebra about extensions of Dedekind domains does
not need separability as a hypothesis; the actual proof of the improved theorem
without a hypothesis of separability is deferred to Section VII.3.

Chapter VII supplies additional background needed for algebraic geometry, 
partly from field theory and partly from the theory of commutative rings. Knowledge
of Noetherian rings is needed throughout the chapter. Sections 4-5 assume
knowledge of localizations, and the indispensable Corollary 7.14 in Section 3
concerns Dedekind domains. The most important result is the Nullstellensatz 
in Section 1. Transcendence degree and Krull dimension in Sections 2 and 4 
are tied to the notion of dimension in algebraic geometry. Zariski’s Theorem 
in Section 5 is tied to the notion of singularities; part of its proof is deferred to 
Chapter X. The material on infinite Galois groups in Section 6 has applications 
to algebraic number theory and algebraic geometry but is not used in this book 
after Chapter VII. 

Chapters VIII-X introduce algebraic geometry from three points of view.
Chapter VIII approaches it as an attempt to understand solutions of simultaneous
polynomial equations in several variables using module-theoretic tools.
Chapter IX approaches the subject of curves as an outgrowth of the complex-analysis
theory of compact Riemann surfaces and uses number-theoretic methods.
Chapter X approaches its subject matter geometrically, using the field-theoretic 
and ring-theoretic tools developed in Chapter VII. All three chapters assume 
knowledge of Section VII.1 on the Nullstellensatz. 

Chapter VIII is in three parts. Sections 1-4 are relatively elementary and 
concern the resultant and preliminary forms of Bezout’s Theorem. Sections 
5-6 concern intersection multiplicity for curves and make extensive use of localizations;
the goal is a better form of Bezout’s Theorem. Sections 7-10
are independent of Sections 5-6 and introduce the theory of Gröbner bases.
This subject was developed comparatively recently and lies behind many of the 
symbolic manipulations of polynomials that are possible with computers. 

Chapter IX concerns irreducible curves and is in two parts. Sections 1-3 define 
divisors and the genus of such a curve, while Sections 4-5 prove the Riemann-Roch
Theorem and give applications of it. The tool for the development is discrete
valuations as in Section VI.2, and the parallel between the theory in Chapter VI 
for algebraic number fields and the theory in Chapter IX for curves becomes more 
evident than ever. Some complex analysis is needed to understand the motivation 
in Sections 1 and 4.

Chapter X largely concerns algebraic sets defined as zero loci over an algebraically
closed field. The irreducible such sets are called varieties. Sections 1-3
are concerned with algebraic sets and their dimension, Sections 4-6 treat maps
between varieties, and Sections 7-8 deal with finer questions. Sections 9-12
are independent of Sections 6-8 and do two things simultaneously: they tie the
theoretical work on dimension to the theory of Gröbner bases in Chapter VIII,
making dimension computable, and they show how the dimension of a zero locus 
is affected by adding one equation to the defining system. The chapter concludes 
with an introductory section about schemes, in which the underlying algebraically 
closed field is replaced by a commutative ring with identity. The entire chapter 
assumes knowledge of elementary point-set topology. 
NOTATION AND TERMINOLOGY


This section contains some items of notation and terminology from Basic Algebra
that are not necessarily reviewed when they occur in the present book. A few
results are mentioned as well. The items are grouped by topic.


Set theory

$\in$                                   membership symbol
$\#S$ or $|S|$                          number of elements in S
$\varnothing$                           empty set
$\{x \in E \mid P\}$                    the set of x in E such that P holds
$E^c$                                   complement of the set E
$E \cup F$, $E \cap F$, $E - F$         union, intersection, difference of sets
$\bigcup_\alpha E_\alpha$, $\bigcap_\alpha E_\alpha$  union, intersection of the sets $E_\alpha$
$E \subseteq F$, $E \supseteq F$        containment
$E \subsetneq F$, $E \supsetneq F$      proper containment
$(a_1, \dots, a_n)$                     ordered n-tuple
$\{a_1, \dots, a_n\}$                   unordered n-tuple
$f : E \to F$, $x \mapsto f(x)$         function, effect of function
$f \circ g$ or $fg$, $f|_E$             composition of f following g, restriction to E
$f(\,\cdot\,, y)$                       the function $x \mapsto f(x, y)$
$f(E)$, $f^{-1}(E)$                     direct and inverse image of a set
in one-one correspondence               matched by a one-one onto function
countable                               finite or in one-one correspondence with integers
$2^A$                                   set of all subsets of A


Number systems

$\delta_{ij}$                           Kronecker delta: 1 if i = j, 0 if i ≠ j
$\binom{n}{k}$                          binomial coefficient
n positive, n negative                  n > 0, n < 0
Z, Q, R, C                              integers, rationals, reals, complex numbers
max, min                                maximum/minimum of finite subset of reals
$[x]$                                   greatest integer $\le x$ if x is real
Re z, Im z                              real and imaginary parts of complex z
$\bar{z}$                               complex conjugate of z
$|z|$                                   absolute value of z
Notation and Terminology


Linear algebra and elementary number theory

$F^n$                         space of n-dimensional column vectors
$e_j$                         j-th standard basis vector of $F^n$
$V'$                          dual vector space of vector space V
$\dim_F V$ or $\dim V$        dimension of vector space V over field F
$0$                           zero vector, matrix, or linear mapping
$1$ or $I$                    identity matrix or linear mapping
$A^t$                         transpose of A
$\det A$                      determinant of A
$[M_{ij}]$                    matrix with (i, j) entry $M_{ij}$
$\binom{L}{\Delta\Gamma}$     matrix of L relative to domain ordered basis $\Gamma$
                              and range ordered basis $\Delta$
$v \cdot w$                   dot product
$\cong$                       is isomorphic to, is equivalent to
$\mathbb{F}_p$                integers modulo a prime p, as a field
GCD                           greatest common divisor
$\equiv$                      is congruent to
$\varphi$                     Euler’s $\varphi$ function


Groups, rings, modules, and categories

$0$                                     additive identity in an abelian group
$1$                                     multiplicative identity in a group or ring
$\cong$                                 is isomorphic to, is equivalent to
$C_m$                                   cyclic group of order m
unit                                    invertible element in ring R with identity
$R^\times$                              group of units in ring R with identity
$R^n$                                   space of column vectors with entries in ring R
$R^o$                                   opposite ring to R with $a \circ b = ba$
$M_{mn}(R)$                             m-by-n matrices with entries in R
$M_n(R)$                                n-by-n matrices with entries in R
unital left R module                    left R module M with 1m = m for all $m \in M$
$\mathrm{Hom}_R(M, N)$                  group of R homomorphisms from M into N
$\mathrm{End}_R(M)$                     ring of R homomorphisms from M into M
$\ker \varphi$, image $\varphi$         kernel and image of $\varphi$
$H^n(G, N)$                             n-th cohomology of group G with coefficients
                                        in abelian group N
simple left R module                    nonzero unital left R module with no proper
                                        nonzero R submodules
semisimple left R module                sum (= direct sum) of simple left R modules
$\mathrm{Obj}(\mathcal{C})$             class of objects for category $\mathcal{C}$
$\mathrm{Morph}_{\mathcal{C}}(A, B)$    set of morphisms from object A to object B
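Groups, rings, modules, and categories, continued

$1_A$                               identity morphism on A
$\mathcal{C}^S$                     category of S-tuples of objects from $\mathrm{Obj}(\mathcal{C})$
product of $\{X_s\}_{s \in S}$      $(X, \{p_s\}_{s \in S})$ such that if A in $\mathrm{Obj}(\mathcal{C})$ and
                                    $\{\varphi_s \in \mathrm{Morph}_{\mathcal{C}}(A, X_s)\}$ are given, then there
                                    exists a unique $\varphi \in \mathrm{Morph}_{\mathcal{C}}(A, X)$ with
                                    $p_s \varphi = \varphi_s$ for all s
coproduct of $\{X_s\}_{s \in S}$    $(X, \{i_s\}_{s \in S})$ such that if A in $\mathrm{Obj}(\mathcal{C})$ and
                                    $\{\varphi_s \in \mathrm{Morph}_{\mathcal{C}}(X_s, A)\}$ are given, then there
                                    exists a unique $\varphi \in \mathrm{Morph}_{\mathcal{C}}(X, A)$ with
                                    $\varphi i_s = \varphi_s$ for all s
$\mathcal{C}^{\mathrm{opp}}$        category opposite to $\mathcal{C}$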


Commutative rings R with identity and factorization of elements

identity                        denoted by 1, allowed to equal 0
ideal $I = (r_1, \dots, r_n)$   ideal generated by $r_1, \dots, r_n$
prime ideal I                   proper ideal with $ab \in I$ implying $a \in I$ or $b \in I$
integral domain                 R with no zero divisors and with $1 \ne 0$
R/I with I prime                always an integral domain
GL(n, R)                        group of invertible n-by-n matrices, entries in R
Chinese Remainder Theorem       $I_1, \dots, I_n$ given ideals with $I_i + I_j = R$ for $i \ne j$.
                                Then the natural map $\varphi : R \to \prod_{j=1}^n R/I_j$ yields
                                isomorphism $R / \bigcap_{j=1}^n I_j \cong R/I_1 \times \cdots \times R/I_n$
                                of rings. Also $\bigcap_{j=1}^n I_j = I_1 \cdots I_n$.
Nakayama’s Lemma                If I is an ideal contained in all maximal ideals
                                and M is a finitely generated unital R module
                                with IM = M, then M = 0.
algebra A over R                unital R module with an R bilinear multiplication
                                $A \times A \to A$. In this book nonassociative
                                algebras appear only in Chapter II, and each
                                associative algebra has an identity.
RG                              group algebra over R for group G
$R[X_1, \dots, X_n]$            polynomial algebra over R with n indeterminates
$R[x_1, \dots, x_n]$            R algebra generated by $x_1, \dots, x_n$
irreducible element $r \ne 0$   $r \notin R^\times$ such that $r = ab$ implies $a \in R^\times$ or $b \in R^\times$
prime element $r \ne 0$         $r \notin R^\times$ such that whenever r divides ab, then
                                r divides a or r divides b
irreducible vs. prime           prime implies irreducible; in any unique
                                factorization domain, irreducible implies prime
GCD                             greatest common divisor in unique factorization
                                domain
Fields

$\mathbb{F}_q$          a finite field with $q = p^n$ elements, p prime
K/F                     an extension field K of a field F
[K : F]                 degree of extension K/F, i.e., $\dim_F K$
$K(X_1, \dots, X_n)$    field of fractions of $K[X_1, \dots, X_n]$
$K(x_1, \dots, x_n)$    field generated by K and $x_1, \dots, x_n$
number field            finite-dimensional field extension of Q
Gal(K/F)                Galois group, automorphisms of K fixing F
$N_{K/F}(\cdot)$ and $\mathrm{Tr}_{K/F}(\cdot)$  norm and trace functions from K to F


Tools for algebraic number theory and algebraic geometry

Noetherian R                  commutative ring with identity whose ideals
                              satisfy the ascending chain condition; has the
                              property that any R submodule of a finitely
                              generated unital R module is finitely generated.
Hilbert Basis Theorem         R nonzero Noetherian implies R[X] Noetherian

Integral closure

Situation: R = integral domain, F = field of fractions, K/F = extension field.

$x \in K$ integral over R     x is a root of a monic polynomial in R[X]
integral closure of R in K    set of $x \in K$ integral over R, is a ring
R integrally closed           R equals its integral closure in F

Localization

Situation: R = commutative ring with identity, S = multiplicative system in R.

$S^{-1}R$                     localization, pairs (r, s) with $r \in R$ and $s \in S$,
                              modulo $(r, s) \sim (r', s')$ if $t(rs' - sr') = 0$
                              for some $t \in S$
property of $S^{-1}R$         $I \mapsto S^{-1}I$ is one-one from set of ideals I in R
                              of form $I = R \cap J$ onto set of ideals in $S^{-1}R$
local ring                    commutative ring with identity having a unique
                              maximal ideal
$R_P$ for prime ideal P       localization with S = complement of P in R
Dedekind domain               Noetherian integrally closed integral domain in
                              which every nonzero prime ideal is maximal, has
                              unique factorization of nonzero ideals as product
                              of prime ideals
Dedekind domain extension     R Dedekind, F field of fractions, K/F finite
                              separable extension, T integral closure of R in K.
                              Then T is Dedekind, and any nonzero prime ideal
                              $\mathfrak{p}$ in R has $\mathfrak{p}T = \prod_{i=1}^g P_i^{e_i}$ for distinct prime
                              ideals $P_i$ with $P_i \cap R = \mathfrak{p}$. These have $\sum_{i=1}^g e_i f_i
                              = [K : F]$, where $f_i = [T/P_i : R/\mathfrak{p}]$.


CHAPTER I 


Transition to Modern Number Theory 


Abstract. This chapter establishes Gauss’s Law of Quadratic Reciprocity, the theory of binary 
quadratic forms, and Dirichlet’s Theorem on primes in arithmetic progressions. 

Section 1 outlines how the three topics of the chapter occurred in natural sequence and marked
a transition as the subject of number theory developed a coherence and moved toward the kind of 
algebraic number theory that is studied today. 

Section 2 establishes quadratic reciprocity, which is a reduction formula providing a rapid method 
for deciding solvability of congruences $x^2 \equiv m \bmod p$ for the unknown x when p is prime.
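
As a rough illustration of how a reciprocity law turns into a fast decision procedure, the Python sketch below decides whether x^2 ≡ m mod p is solvable for an odd prime p. It is a minimal sketch under stated assumptions: it works with the Jacobi symbol, a standard extension of the Legendre symbol that this abstract does not mention, so that the reciprocity step can be applied repeatedly without factoring. The function name `jacobi` is hypothetical, and the routine is not presented as the chapter's own algorithm.

```python
def jacobi(m: int, p: int) -> int:
    """Jacobi symbol (m/p) for odd positive p; it is the Legendre symbol when p is prime.

    Returns +1 or -1 when gcd(m, p) == 1, and 0 otherwise.
    """
    if p <= 0 or p % 2 == 0:
        raise ValueError("p must be a positive odd integer")
    m %= p
    result = 1
    while m != 0:
        # Remove factors of 2 with the supplementary law:
        # (2/p) = -1 exactly when p is 3 or 5 mod 8.
        while m % 2 == 0:
            m //= 2
            if p % 8 in (3, 5):
                result = -result
        # Reciprocity step: swapping the two odd entries changes the sign
        # only when both are congruent to 3 mod 4.
        m, p = p, m
        if m % 4 == 3 and p % 4 == 3:
            result = -result
        m %= p
    return result if p == 1 else 0
```

For a prime p not dividing m, the congruence is solvable exactly when `jacobi(m, p) == 1`. Each pass through the loop shrinks the arguments in the manner of the Euclidean algorithm, which is the sense in which such a reduction formula is rapid.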

Sections 3-5 develop the theory of binary quadratic forms $ax^2 + bxy + cy^2$, where a, b, c are
integers. The basic tool is that of proper equivalence of two such forms, which occurs when the two 
forms are related by an invertible linear substitution with integer coefficients and determinant 1. The 
theorems establish the finiteness of the number of proper equivalence classes for given discriminant, 
conditions for the representability of primes by forms of a given discriminant, canonical representatives
of the finitely many proper equivalence classes of a given discriminant, a group law for proper
equivalence classes of forms of the same discriminant that respects representability of integers by 
the classes, and a theory of genera that takes into account inequivalent forms whose values cannot 
be distinguished by linear congruences. 

Sections 6-7 digress to leap forward historically and interpret the group law for proper equivalence 
classes of binary quadratic forms in terms of an equivalence relation on the nonzero ideals in the 
ring of integers of an associated quadratic number field. 

Sections 8-10 concern Dirichlet’s Theorem on primes in arithmetic progressions. Section 8 
discusses Euler’s product formula for $\sum_{n=1}^{\infty} n^{-s}$ and shows how Euler was able to modify it to 
prove that there are infinitely many primes $4k + 1$ and infinitely many primes $4k + 3$. Section 9 
develops Dirichlet series as a tool to be used in the generalization, and Section 10 contains the proof 
of Dirichlet’s Theorem. Section 8 uses some elementary real analysis, and Sections 9-10 use both 
elementary real analysis and elementary complex analysis. 


1. Historical Background 


The period 1800 to 1840 saw great advances in number theory as the subject 
developed a coherence and moved toward the kind of algebraic number theory that 
is studied today. The groundwork had been laid chiefly by Euclid, Diophantus, 
Fermat, Euler, Lagrange, and Legendre. Some of what those people did was 
remarkably insightful for its time, but what collectively had come out of their 
labors was more a collection of miscellaneous results than an organized theory. 
It was Gauss who first gave direction and depth to the subject, beginning with 
his book Disquisitiones Arithmeticae in 1801. Dirichlet built on Gauss’s work, 
clarifying the deeper parts and adding analytic techniques that pointed toward 
the integrated subject of the future. This chapter concentrates on three jewels of 
classical number theory, largely the work of Gauss and Dirichlet, that seem on 
the surface to be only peripherally related but are actually a natural succession 
of developments leading from earlier results toward modern algebraic number 
theory. To understand the context, it is necessary to back up for a moment. 

Diophantine equations in two or more variables have always lain at the heart of 
number theory. Fundamental examples that have played an important role in the 
development of the subject are $ax^2 + bxy + cy^2 = m$ for unknown integers x and y; 
$x_1^2 + x_2^2 + x_3^2 + x_4^2 = m$ for unknown integers $x_1, x_2, x_3, x_4$; $y^2 = x(x-1)(x+1)$ 
for unknown integers x and y; and $x^n + y^n = z^n$ for unknown integers x, y, z. 

In every case one can get an immediate necessary condition on a solution by 
writing the equation modulo some integer n. The necessary condition is that 
the corresponding congruence modulo n have a solution. For example take the 
equation $x^2 + y^2 = p$, where p is a prime, and let us allow ourselves to use 
the more elementary results of Basic Algebra. Writing the equation modulo 
p leads to $x^2 + y^2 \equiv 0 \bmod p$. Certainly x cannot be divisible by p, since 
otherwise y would be divisible by p, $x^2$ and $y^2$ would be divisible by $p^2$, and 
$x^2 + y^2 = p$ would be divisible by $p^2$, contradiction. Thus we can divide, 
obtaining $1 + (yx^{-1})^2 \equiv 0 \bmod p$. Hence $z^2 \equiv -1 \bmod p$ for $z \equiv yx^{-1}$. If p 
is an odd prime, then $-1$ has order 2, and the necessary condition is that there 
exist some z in $\mathbb{F}_p^\times$ whose order is exactly 4. Since $\mathbb{F}_p^\times$ is cyclic of order $p - 1$, 
the necessary condition is that 4 divide $p - 1$. 
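
The necessary condition just derived is easy to test numerically. The following Python lines are only a sketch for small primes; the helper names `is_prime` and `minus_one_is_square` are illustrative choices and are not notation from the text.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def minus_one_is_square(p):
    """Return True if z^2 = -1 mod p has a solution z."""
    return any((z * z + 1) % p == 0 for z in range(1, p))

# For each odd prime p below 200, -1 is a square modulo p
# exactly when 4 divides p - 1.
for p in filter(is_prime, range(3, 200)):
    assert minus_one_is_square(p) == ((p - 1) % 4 == 0)
```

The brute-force search over z is deliberate: it uses nothing beyond the definition of a square modulo p.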

Using a slightly more complicated argument, we can establish conversely that 
the divisibility of $p - 1$ by 4 implies that $x^2 + y^2 = p$ is solvable for integers 
x and y. In fact, we know from the solvability of $z^2 \equiv -1 \bmod p$ that there 
exists an integer r such that p divides $r^2 + 1$. Consider the possibilities in the 
integral domain $\mathbb{Z}[i]$ of Gaussian integers, where $i = \sqrt{-1}$. It was shown in 
Chapter VIII of Basic Algebra that $\mathbb{Z}[i]$ is Euclidean. Hence $\mathbb{Z}[i]$ is a principal 
ideal domain, and its elements have unique factorization. If p remains prime in 
$\mathbb{Z}[i]$, then the fact that p divides $(r + i)(r - i)$ implies that p divides $r + i$ or 
$r - i$ in $\mathbb{Z}[i]$. Then at least one of $\frac{r}{p} + \frac{i}{p}$ and $\frac{r}{p} - \frac{i}{p}$ would have to be in $\mathbb{Z}[i]$. 
Since the coefficient $\pm\frac{1}{p}$ of $i$ is not an integer, this divisibility does not hold, and we conclude that p 
does not remain prime in $\mathbb{Z}[i]$. If we write $p = (a + bi)(c + di)$ nontrivially, 
then $p^2 = |a + bi|^2 |c + di|^2 = (a^2 + b^2)(c^2 + d^2)$ as an equality in $\mathbb{Z}$, and we 
readily conclude that $a^2 + b^2 = p$. 
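
The argument can be made computational. The sketch below is one possible realization, not a method taken from the text: it finds r with $r^2 \equiv -1 \bmod p$ and then runs the Euclidean algorithm in $\mathbb{Z}[i]$ on p and $r + i$. Since p divides the norm of $r + i$ but does not divide $r + i$, the greatest common divisor is a factor $a + bi$ of norm p. Gaussian integers are modeled as pairs (real part, imaginary part), and all function names are hypothetical.

```python
def gauss_mod(a, b):
    """Remainder of a on division by b in Z[i], using the nearest-integer quotient."""
    (ar, ai), (br, bi) = a, b
    n = br * br + bi * bi                            # norm of b
    tr, ti = ar * br + ai * bi, ai * br - ar * bi    # a times the conjugate of b
    qr, qi = (2 * tr + n) // (2 * n), (2 * ti + n) // (2 * n)
    return (ar - (qr * br - qi * bi), ai - (qr * bi + qi * br))

def gauss_gcd(a, b):
    """Euclidean algorithm in Z[i]; the remainder norm at least halves each step."""
    while b != (0, 0):
        a, b = b, gauss_mod(a, b)
    return a

def sqrt_minus_one(p):
    """An integer r with r^2 = -1 mod p, for a prime p congruent to 1 mod 4."""
    for g in range(2, p):
        if pow(g, (p - 1) // 2, p) == p - 1:
            return pow(g, (p - 1) // 4, p)           # its square is g^((p-1)/2) = -1
    raise ValueError("p must be a prime congruent to 1 modulo 4")

def two_squares(p):
    """Return (a, b) with a^2 + b^2 = p."""
    r = sqrt_minus_one(p)
    a, b = gauss_gcd((p, 0), (r, 1))
    return abs(a), abs(b)

print(two_squares(13))   # (3, 2)
```

Rounding the quotient to the nearest Gaussian integer is what makes the division Euclidean: the remainder has norm at most half the norm of the divisor.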

This much argument solves the Diophantine equation $x^2 + y^2 = p$ for p prime. 
For p replaced by a general integer m, we use the identity 

\[ (x_1^2 + y_1^2)(x_2^2 + y_2^2) = (x_1x_2 - y_1y_2)^2 + (x_1y_2 + x_2y_1)^2, \]

which has been known since antiquity, and we see that $x^2 + y^2 = m$ is solvable 
if m is a product of odd primes of the form $4k + 1$. It is solvable also if $m = 2$ 
and if $m = p^2$ for any prime p. Thus $x^2 + y^2 = m$ is solvable whenever m is a 
positive integer such that each prime of the form $4k + 3$ dividing m divides m an 
even number of times. Using congruences modulo prime powers, we see that this 
condition is also necessary, and we arrive at the following result; historically it 
had already been asserted as a theorem by Fermat and was subsequently proved 
by Euler, albeit by more classical methods than we have used. 


Proposition 1.1. The Diophantine equation $x^2 + y^2 = m$ is solvable in integers 
x and y for a given positive integer m if and only if every prime number $p = 4k + 3$ 
dividing m occurs an even number of times in the prime factorization of m. 
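
Proposition 1.1 can be compared with a direct search. In the following sketch the function names are illustrative: `is_sum_of_two_squares` searches for a solution, and `criterion` evaluates the condition on the prime factorization.

```python
from math import isqrt

def is_sum_of_two_squares(m):
    """Brute-force search for integers x, y with x^2 + y^2 = m."""
    for x in range(isqrt(m) + 1):
        y2 = m - x * x
        if isqrt(y2) ** 2 == y2:
            return True
    return False

def criterion(m):
    """True if each prime of the form 4k + 3 divides m an even number of times."""
    p = 2
    while p * p <= m:
        e = 0
        while m % p == 0:
            m //= p
            e += 1
        if p % 4 == 3 and e % 2 == 1:
            return False
        p += 1
    return m % 4 != 3      # a leftover factor m > 1 is a prime occurring once

assert all(is_sum_of_two_squares(m) == criterion(m) for m in range(1, 2000))
```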


The first step in the above argument used congruence information; we had 
to know the primes p for which $z^2 \equiv -1 \bmod p$ is solvable. The second step 
was in two parts, both rather special. First we used specific information about 
the nature of factorization in a particular ring of algebraic integers, namely $\mathbb{Z}[i]$. 
Second we used that the norm of a product is the product of the norms in that 
same ring of algebraic integers. 

It is too much to hope that some recognizable generalization of these steps with 
$x^2 + y^2 = m$ can handle all or most Diophantine equations. At least the first step 
is available in complete generality, and indeed number theory, both classical and 
modern, deduces many helpful conclusions by passing to congruences. There 
is the matter of deducing something useful from a given congruence, but doing 
so is a finite problem for each prime. Like some others before him, Gauss set 
about studying congruences systematically. Linear congruences are easy and had 
been handled before. Quadratic congruences are logically the next step. The 
first jewel of classical number theory to be discussed in this chapter is the Law 
of Quadratic Reciprocity of Gauss, which appears below as Theorem 1.2 and 
which makes useful deductions possible in the case of quadratic congruences. In 
effect quadratic reciprocity allows one to decide easily which integers are squares 
modulo a prime p. Euler had earlier come close to finding the statement of this 
result, and Legendre had found the exact statement without finding a complete 
proof. Gauss was the one who gave the first complete proof. 


Part of the utility of quadratic reciprocity is that it helps one to attack quadratic 
Diophantine equations more systematically. The second jewel of classical number 
theory to be discussed in this chapter is the body of results concerning representing 
integers by binary quadratic forms $ax^2 + bxy + cy^2 = m$ that do not degenerate in 
some way. Lagrange and Legendre had already made advances in this theory, but 
Gauss’s own discoveries were decisive. Dirichlet simplified the more advanced 
parts of the theory and investigated an aspect of it that Gauss had not addressed 
and that would lead Dirichlet to his celebrated theorem on primes in arithmetic 
progressions.¹ 

Lagrange had introduced the notion of the discriminant of a quadratic form 
and a notion of equivalence of such forms, two forms of the same discriminant 
being equivalent if one can be obtained from the other by a linear invertible 
substitution with integer entries. Equivalence is important because equivalent 
forms represent the same numbers. He established also a theory of reduced forms 
that specifies representatives of each equivalence class. For an odd prime p, 
$ax^2 + bxy + cy^2 = p$ is solvable only if the discriminant $b^2 - 4ac$ is a square 
modulo p, and Lagrange was hampered by not knowing quadratic reciprocity. 
But he did know some special cases, such as when 5 is a square modulo p, and he 
was able to deal completely with discriminant $-20$. For this discriminant, there 
are two equivalence classes, represented by $x^2 + 5y^2$ and $2x^2 + 2xy + 3y^2$, and 
Lagrange showed for primes p other than 2 and 5 that 


\[ x^2 + 5y^2 = p \quad \text{is solvable if and only if} \quad p \equiv 1 \text{ or } 9 \bmod 20, \]

\[ 2x^2 + 2xy + 3y^2 = p \quad \text{is solvable if and only if} \quad p \equiv 3 \text{ or } 7 \bmod 20; \]

the fact about $x^2 + 5y^2 = p$ had been conjectured earlier by Euler. Lagrange 
observed further that 


\[ (2x_1^2 + 2x_1y_1 + 3y_1^2)(2x_2^2 + 2x_2y_2 + 3y_2^2)
   = (2x_1x_2 + x_1y_2 + y_1x_2 + 3y_1y_2)^2 + 5(x_1y_2 - y_1x_2)^2, \]

from which it follows that the product of two primes congruent to 3 or 7 modulo 
20 is representable as $x^2 + 5y^2$; this fact had been conjectured by Fermat. 
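
Lagrange's identity is a polynomial identity and can be spot-checked on random integers. In the sketch below the second square is formed from $x_1y_2 - y_1x_2$, which is the combination for which the identity holds; the names `f` and `compose` are illustrative.

```python
import random

def f(x, y):
    """The form 2x^2 + 2xy + 3y^2 of discriminant -20."""
    return 2 * x * x + 2 * x * y + 3 * y * y

def compose(x1, y1, x2, y2):
    """Return (X, Y) with f(x1, y1) * f(x2, y2) = X^2 + 5 Y^2."""
    X = 2 * x1 * x2 + x1 * y2 + y1 * x2 + 3 * y1 * y2
    Y = x1 * y2 - y1 * x2
    return X, Y

random.seed(0)
for _ in range(1000):
    x1, y1, x2, y2 = (random.randint(-50, 50) for _ in range(4))
    X, Y = compose(x1, y1, x2, y2)
    assert f(x1, y1) * f(x2, y2) == X * X + 5 * Y * Y

# 3 = f(0, 1) and 7 = f(1, 1), so 21 = 3 * 7 is of the form X^2 + 5 Y^2.
print(compose(0, 1, 1, 1))   # (4, -1)
```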

Legendre added to this investigation the correct formula for quadratic reci- 
procity, which he incorrectly believed he had proved, and many of its conse- 
quences for representability of primes by binary quadratic forms. In addition, 
he tried to develop a theory of composition of forms that generalizes Lagrange’s 
identity above, but he had only limited success. 

In addition to establishing quadratic reciprocity, Gauss introduced the vital notion 
of “proper equivalence” for forms $ax^2 + bxy + cy^2$ of the same discriminant, 
two forms of the same discriminant being properly equivalent if one can be 
obtained from the other by a linear invertible substitution with integer entries 
and determinant $+1$. In terms of this definition, he settled the representability 
of primes by binary quadratic forms, he showed that there are only finitely many 
proper equivalence classes for each discriminant, and he gave an algorithm for 
deciding whether two forms are properly equivalent. The main results of Gauss in 
this direction appear as Theorems 1.6 and 1.8 below. In addition, Gauss showed, 
without the benefit of having a definition of “group,” in effect that the set of 
proper equivalence classes of forms with a given discriminant becomes a finite 
abelian group in a way that controls representability of nonprime integers; by 
contrast, Lagrange’s definition of equivalence does not lead to a group structure. 
Gauss’s main results in this direction, as recast by Dirichlet, appear as Theorem 
1.12 below. 

¹ These matters are affirmed in Dirichlet’s Lectures on Number Theory. The aspect that Gauss 
had not addressed and that provided motivation for Dirichlet is the value of the “Dirichlet class 
number” $h(D)$ defined below. 

The story does not stop here, but let us pause for a moment to say what La- 
grange’s theory, as amended by Gauss, says for the above example, first rephrasing 
the context in more modern terminology. We saw earlier that unique factorization 
in the ring Z[i] of Gaussian integers is the key to the representation of integers 
by the quadratic form $x^2 + y^2$. For a general quadratic form $ax^2 + bxy + cy^2$ 
with discriminant $D = b^2 - 4ac$, properties of the ring R of algebraic integers in 
the field $\mathbb{Q}(\sqrt{D})$ are relevant for the questions that Gauss investigated. It turns 
out that R is a principal ideal domain if Gauss’s finite abelian group of proper 
equivalence classes is trivial and that when D is “fundamental,” there is a suitable 
converse.² 

With the context rephrased we come back to the example. Consider the 
equation $x^2 + 5y^2 = p$ for primes p. The discriminant of $x^2 + 5y^2$ is $-20$, and the 
relevant ring of algebraic integers is $\mathbb{Z}[\sqrt{-5}]$, which is not a unique factorization 
domain. Thus the argument used with $x^2 + y^2 = p$ does not apply, and we 
have no reason to expect that solvability of $x^2 + 5y^2 \equiv 0 \bmod p$ is sufficient for 
solvability of $x^2 + 5y^2 = p$. Let us look more closely. The congruence condition 
is that $-20$ is a square modulo p. Thus $-5$ is to be a square modulo p. If we 
leave aside the primes $p = 2$ and $p = 5$ that divide 20, the Law of Quadratic 
Reciprocity will tell us that the necessary congruence resulting from solvability 
of $x^2 + 5y^2 = p$ is that p be congruent to 1, 3, 7, or 9 modulo 20. However, we 
can compute all residues n of $x^2 + 5y^2$ modulo 20 for n with $\mathrm{GCD}(n, 20) = 1$ to 
see that 

\[ x^2 + 5y^2 \equiv 1 \text{ or } 9 \bmod 20 \quad \text{if } \mathrm{GCD}(x^2 + 5y^2, 20) = 1. \]

Meanwhile, the form $2x^2 + 2xy + 3y^2$ has discriminant $-20$, and we can check 
that solvability of $2x^2 + 2xy + 3y^2 = p$ leads to the conclusion that 

\[ 2x^2 + 2xy + 3y^2 \equiv 3 \text{ or } 7 \bmod 20 \quad \text{if } \mathrm{GCD}(2x^2 + 2xy + 3y^2, 20) = 1. \]
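
The residue computation is a finite check, since the value of a form modulo 20 depends only on x and y modulo 20. A minimal sketch follows; the function name `residues_prime_to_20` is an illustrative choice.

```python
from math import gcd

def residues_prime_to_20(form):
    """Residues modulo 20, prime to 20, taken by form(x, y) as x, y run modulo 20."""
    values = {form(x, y) % 20 for x in range(20) for y in range(20)}
    return sorted(v for v in values if gcd(v, 20) == 1)

def principal(x, y):
    return x * x + 5 * y * y

def other(x, y):
    return 2 * x * x + 2 * x * y + 3 * y * y

print(residues_prime_to_20(principal))   # [1, 9]
print(residues_prime_to_20(other))       # [3, 7]
```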


² In each of the situations (a) and (b) of Proposition 1.17 below, R is a principal ideal domain 
only if Gauss’s group is trivial. In all other cases, Gauss’s group is nontrivial, and R is a principal 
ideal domain only if the group has order 2. 

Lagrange’s theory easily shows that representability of integers by a form depends 
only on the equivalence class of the form and that all primes congruent to 1, 3, 
7, or 9 modulo 20 are representable by some form. This example is special 
in that equivalence and proper equivalence come to the same thing. Gauss’s 
multiplication rule for proper equivalence classes of forms with discriminant 
$-20$ produces a group of order 2, with $x^2 + 5y^2$ representing the identity class 
and $2x^2 + 2xy + 3y^2$ representing the other class. Consequently 


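\[ p \equiv 1 \text{ or } 9 \bmod 20 \quad \text{implies} \quad x^2 + 5y^2 = p \ \text{solvable}, \]

\[ p \equiv 3 \text{ or } 7 \bmod 20 \quad \text{implies} \quad 2x^2 + 2xy + 3y^2 = p \ \text{solvable}. \]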


In addition, the multiplication rule has the property that if m is representable by 
all forms in the class of $a_1x^2 + b_1xy + c_1y^2$ and n is representable by all forms 
in the class of $a_2x^2 + b_2xy + c_2y^2$, then mn is representable by all forms in the 
class of the product form. It is not necessary to have an explicit identity for the 
multiplication. Thus, for example, it follows without further argument that if p 
and q are primes congruent to 3 or 7 modulo 20, then $x^2 + 5y^2 = pq$ is solvable. 
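
These statements about discriminant $-20$ can be confirmed for small primes by exhaustive search. The sketch below assumes a positive definite form, so that the search region is bounded; the function names and the bound 100 on the primes are illustrative choices.

```python
from math import isqrt

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, isqrt(n) + 1))

def represents(a, b, c, m):
    """True if a*x^2 + b*x*y + c*y^2 = m has an integer solution.

    Assumes a > 0 and b^2 - 4ac < 0, which bounds |x| and |y|.
    """
    d = 4 * a * c - b * b                      # minus the discriminant
    xmax = isqrt(4 * c * m // d) + 1           # from 4c*F = (bx + 2cy)^2 + d*x^2
    ymax = isqrt(4 * a * m // d) + 1           # from 4a*F = (2ax + by)^2 + d*y^2
    return any(a * x * x + b * x * y + c * y * y == m
               for x in range(-xmax, xmax + 1)
               for y in range(-ymax, ymax + 1))

primes = [p for p in range(3, 100) if is_prime(p) and p != 5]
for p in primes:
    assert represents(1, 0, 5, p) == (p % 20 in (1, 9))
    assert represents(2, 2, 3, p) == (p % 20 in (3, 7))

# Products of two primes congruent to 3 or 7 modulo 20 are values of x^2 + 5y^2.
special = [p for p in primes if p % 20 in (3, 7)]
assert all(represents(1, 0, 5, p * q) for p in special for q in special)
```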

Let us elaborate a little about the rephrased context for Gauss’s theory. We let 
D be the discriminant of the binary quadratic forms in question, and we assume 
that D is “fundamental.” Let R be the ring of algebraic integers that lie in the 
field $\mathbb{Q}(\sqrt{D})$. It turns out to be possible to define a notion of “strict equivalence” 
on the set of ideals of R in such a way that multiplication of ideals descends to a 
multiplication of strict equivalence classes. The strict equivalence classes of ideals 
then form a group, and this group is isomorphic to Gauss’s group. In particular, 
one obtains the nonobvious conclusion that the set of strict equivalence classes 
of ideals is finite. The main result giving this isomorphism is Theorem 1.20. 
This rephrasing of the theory points to a generalization to algebraic number fields 
of degree higher than 2 and is a starting point for modern algebraic number theory. 

Now we return to the work of Gauss. Even the example with D = —20 that was 
described above does not give an idea of how complicated matters can become. 
For discriminant $-56$, for example, the two forms $x^2 + 14y^2$ and $2x^2 + 7y^2$ take on 
the same residues modulo 56 that are prime to 56, but no prime can be represented 
by both forms. These two forms and the forms $3x^2 \pm 2xy + 5y^2$ represent the 
four proper equivalence classes. By contrast, there are only three equivalence 
classes in Lagrange’s sense, and we thus get some insight into why Legendre 
encountered difficulties in defining a useful multiplication even for $D = -56$. 
Gauss’s theory goes on to address the problem that $x^2 + 14y^2$ and $2x^2 + 7y^2$ take 
on one set of residues modulo 56 and prime to 56 while $3x^2 \pm 2xy + 5y^2$ take 
on a disjoint set of such residues. Gauss defined a “genus” (plural: “genera”) 
to consist of proper equivalence classes like these that cannot be distinguished 
by linear congruences, and he obtained some results about this notion. Gauss’s 
set of genera inherits a group structure from the group structure on the proper 
equivalence classes of forms, and the group structure for the genera enables one 
to work with genera easily. 

The third jewel of classical number theory to be discussed in this chapter is 
Dirichlet’s celebrated theorem on primes in arithmetic progressions, given below 
as Theorem 1.21. The statement is that if m and b are positive relatively prime 
integers, then there are infinitely many primes of the form km + b with k a positive 
integer. The proof mixes algebra, a little real analysis, and some complex analysis. 
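
Numerical evidence for the statement is easy to produce, although of course it proves nothing. The following sketch counts the primes below a bound in each residue class b modulo m with GCD(b, m) = 1; the function names, the modulus 20, and the bound 10000 are illustrative choices, and m is assumed to be at least 2.

```python
from math import gcd

def primes_below(n):
    """Sieve of Eratosthenes: all primes less than n, for n >= 2."""
    sieve = bytearray([1]) * n
    sieve[0:2] = b"\x00\x00"
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = bytearray(len(range(i * i, n, i)))
    return [i for i in range(n) if sieve[i]]

def count_by_residue(m, limit):
    """Number of primes below limit in each class b mod m with GCD(b, m) = 1."""
    counts = {b: 0 for b in range(1, m) if gcd(b, m) == 1}
    for p in primes_below(limit):
        if p % m in counts:
            counts[p % m] += 1
    return counts

print(count_by_residue(20, 10000))
```

Each of the eight classes modulo 20 receives a share of the primes, consistent with the theorem.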

What is not immediately apparent is how this theorem fits into a natural 
historical sequence with Gauss’s theory of binary quadratic forms. In fact, the 
statement about primes in arithmetic progressions was thrust upon Dirichlet in at 
least two ways. Dirichlet thoroughly studied the work of those who came before 
him. One aspect of that work was Legendre’s progress toward obtaining quadratic 
reciprocity; in fact, Legendre actually had a proof of quadratic reciprocity except 
that he assumed the unproved result about primes in arithmetic progressions for 
part of it and argued in circular fashion for another part of it. Another aspect 
of the work Dirichlet studied was Gauss’s theory of multiplication of proper 
equivalence classes of forms, which Dirichlet saw a need to simplify and explain; 
indeed, a complete answer to the representability of composite numbers requires 
establishing theorems about genera beyond what Gauss obtained and has to make 
use of the theorem about primes in arithmetic progressions. 

In addition, Dirichlet asked and settled a question about proper equivalence 
classes for which Gauss had published nothing and for which Jacobi had conjectured 
an answer: How many such classes are there for each discriminant D? Let 
us call this number the “Dirichlet class number,” denoting it by $h(D)$. Dirichlet’s 
answer has several cases to it. When D is fundamental, even, negative, and not 
equal to $-4$, the answer is 

\[ h(D) = \frac{2\sqrt{|D/4|}}{\pi} \sum_{\substack{n \ge 1, \\ \mathrm{GCD}(n, D) = 1}} \left(\frac{D/4}{n}\right) \frac{1}{n}, \]

with the sum taken over positive integers prime to D. Here when p is a prime 
not dividing D, $\left(\frac{D/4}{p}\right)$ is $+1$ if $D/4$ is a square modulo p and is $-1$ if not. For 
general $n = \prod p^k$ prime to D, $\left(\frac{D/4}{n}\right)$ is the product of the expressions $\left(\frac{D/4}{p}\right)^k$ 
corresponding to the factorization³ of n. When $D = -4$, the quantity on the right 
side has to be doubled to give the correct result, and thus the formula becomes 

\[ h(-4) = \frac{4}{\pi} \sum_{n \text{ odd} \ge 1} \left(\frac{-1}{n}\right) \frac{1}{n} = \frac{4}{\pi} \left(1 - \frac13 + \frac15 - \frac17 + \cdots \right). \]

The adjusted formula correctly gives $h(-4) = 1$, since Leibniz had shown 
more than a century earlier that $1 - \frac13 + \frac15 - \frac17 + \cdots = \frac{\pi}{4}$. Dirichlet was able to 
evaluate the displayed infinite series for general D as a finite sum, but that further 
step does not concern us here. The important thing to observe is that the infinite 
series is always an instance of a series $\sum_{n=1}^{\infty} \chi(n)/n$ with $\chi$ a periodic function 
on the positive integers satisfying $\chi(mn) = \chi(m)\chi(n)$. Dirichlet’s derivation 
of a series expansion for his class numbers required care because the series is only 
conditionally convergent. To be able to work with absolutely convergent series, 
he initially replaced $\frac{1}{n}$ by $\frac{1}{n^s}$ for $s > 1$, thus initially treating series he denoted by 
$L(s, \chi) = \sum_{n=1}^{\infty} \chi(n)/n^s$. 

³ The expression $\left(\frac{D/4}{n}\right)$ is called a “Jacobi symbol.” See Problems 9-11 at the end of the chapter. 
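
The class number formula can be tested numerically by truncating the series. The sketch below is an approximation only: the series converges conditionally, so the partial sum is accurate only to roughly the reciprocal of the number of terms. The routine `jacobi` uses the standard binary algorithm for the Jacobi symbol, which is not the definition by factorization given above but yields the same values for odd positive n; the function names and the default of 200000 terms are assumptions made for illustration.

```python
from math import gcd, pi, sqrt

def jacobi(a, n):
    """Jacobi symbol (a/n) for odd positive n, by the standard binary algorithm."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def class_number_estimate(D, terms=200_000):
    """Truncated series for h(D), with D fundamental, even, and negative."""
    d = D // 4
    s = sum(jacobi(d, n) / n for n in range(1, terms, 2) if gcd(n, D) == 1)
    value = 2 * sqrt(abs(d)) / pi * s
    return 2 * value if D == -4 else value     # the formula is doubled when D = -4

for D in (-4, -20, -56):
    print(D, round(class_number_estimate(D), 3))
```

The rounded values agree with the counts of classes for discriminants $-20$ and $-56$ described in the text, namely 2 and 4.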

As a consequence of this work, Dirichlet was familiar with series $L(s, \chi)$ and 
was aware of the importance of expressions $L(1, \chi)$, knowing that at least when 
$\chi(n) = \left(\frac{D/4}{n}\right)$, $L(1, \chi)$ is not 0 because it is essentially a class number. This 
nonvanishing turns out to be the core of the proof of the theorem on primes in 
arithmetic progressions. Dirichlet would have known about Euler’s proof that 
the progressions $4n + 1$ and $4n + 3$ contain infinitely many primes, a proof 
that we give in Section 8, and he would have recognized Euler’s expression 
$\sum_{n=0}^{\infty} (-1)^n/(2n + 1)$ as something that occurs in his formula for $h(-4)$. Thus 
he was well equipped with tools and motivation for a proof of his theorem on 
primes in arithmetic progressions. 


2. Quadratic Reciprocity 


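If p is an odd prime number and a is an integer with $a \not\equiv 0 \bmod p$, the Legendre 
symbol $\left(\frac{a}{p}\right)$ is defined by 

\[ \left(\frac{a}{p}\right) = \begin{cases} +1 & \text{if } a \text{ is a square modulo } p, \\ -1 & \text{if } a \text{ is not a square modulo } p. \end{cases} \]

Since $\mathbb{F}_p^\times$ is a cyclic group of even order, the squares form a subgroup of index 2. 
Therefore $a \mapsto \left(\frac{a}{p}\right)$ is a group homomorphism of $\mathbb{F}_p^\times$ into $\{\pm 1\}$, and we have 
$\left(\frac{a}{p}\right)\left(\frac{b}{p}\right) = \left(\frac{ab}{p}\right)$ whenever a and b are not divisible by p. 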
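
The definition translates directly into a computation for small p. In the following sketch the function name `legendre` and the choice p = 23 are illustrative; the loop checks the multiplicative property and the fact that the squares have index 2.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p not dividing a, by listing squares."""
    squares = {(x * x) % p for x in range(1, p)}
    return 1 if a % p in squares else -1

p = 23
for a in range(1, p):
    for b in range(1, p):
        assert legendre(a, p) * legendre(b, p) == legendre(a * b, p)

# Exactly half of the nonzero residues are squares.
assert sum(legendre(a, p) == 1 for a in range(1, p)) == (p - 1) // 2
```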

Theorem 1.2 (Law of Quadratic Reciprocity). If p and q are distinct odd 
prime numbers, then 

(a) $\left(\frac{-1}{p}\right) = (-1)^{\frac12(p-1)}$, 

(b) $\left(\frac{2}{p}\right) = (-1)^{\frac18(p^2-1)}$, 

(c) $\left(\frac{p}{q}\right)\left(\frac{q}{p}\right) = (-1)^{[\frac12(p-1)][\frac12(q-1)]}$. 

REMARKS. Conclusion (a) is due to Fermat and says that $-1$ is a square modulo 
p if and only if $p = 4n + 1$. We proved this result already in Section 1 and will 
not re-prove it here. Conclusion (b) is due to Euler and says that 2 is a square 
modulo p if and only if $p = 8n \pm 1$. Conclusion (c) is due to Gauss and says 
that if p or q is $4n + 1$, then $\left(\frac{p}{q}\right) = \left(\frac{q}{p}\right)$ and otherwise $\left(\frac{p}{q}\right) = -\left(\frac{q}{p}\right)$. The proofs of 
(b) and (c) will occupy the remainder of this section. 
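
All three conclusions of Theorem 1.2 can be checked against the definition of the Legendre symbol for small primes. The following sketch does so for odd primes below 100; the helper names are illustrative.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p not dividing a, by listing squares."""
    squares = {(x * x) % p for x in range(1, p)}
    return 1 if a % p in squares else -1

def odd_primes(limit):
    """Odd primes below limit, by trial division."""
    return [n for n in range(3, limit, 2)
            if all(n % d for d in range(3, int(n ** 0.5) + 1, 2))]

primes = odd_primes(100)
for p in primes:
    assert legendre(-1, p) == (-1) ** ((p - 1) // 2)                 # conclusion (a)
    assert legendre(2, p) == (-1) ** ((p * p - 1) // 8)              # conclusion (b)
    for q in primes:
        if q != p:
            sign = (-1) ** (((p - 1) // 2) * ((q - 1) // 2))
            assert legendre(p, q) * legendre(q, p) == sign           # conclusion (c)
```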


EXAMPLES. 


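(1) This example illustrates how quickly iterated use of the theorem decides 
whether a given integer is a square. We compute $\left(\frac{17}{79}\right)$. We have 

\[ \left(\tfrac{17}{79}\right) = \left(\tfrac{79}{17}\right) = \left(\tfrac{11}{17}\right) = \left(\tfrac{17}{11}\right) = \left(\tfrac{6}{11}\right) = -\left(\tfrac{3}{11}\right) = \left(\tfrac{11}{3}\right) = \left(\tfrac{2}{3}\right) = -1, \]

the successive equalities being justified by using (c), the formula $\left(\frac{a+kp}{p}\right) = \left(\frac{a}{p}\right)$, 
(c) again, $\left(\frac{a+kp}{p}\right) = \left(\frac{a}{p}\right)$ again, the formula $\left(\frac{a}{p}\right)\left(\frac{b}{p}\right) = \left(\frac{ab}{p}\right)$ and (b), (c) once more, 
$\left(\frac{a+kp}{p}\right) = \left(\frac{a}{p}\right)$ once more, and an explicit evaluation of $\left(\frac{2}{3}\right)$. 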


(2) Lemma 9.46 of Basic Algebra asserts that 3 is a generator of the cyclic 
group $\mathbb{F}_n^\times$ when n is prime of the form $2^{2^N} + 1$ with $N > 0$, and Theorem 1.2 
enables us to give a proof. In fact, this n has $n \equiv 2 \bmod 3$ and $n \equiv 1 \bmod 4$. 
Thus $\left(\frac{3}{n}\right) = \left(\frac{n}{3}\right) = \left(\frac{2}{3}\right) = -1$. Since $\mathbb{F}_n^\times$ is a cyclic group whose order is a power 
of 2, every nonsquare is a generator. Thus 3 is a generator. 
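
The iterated use of the theorem in these examples can be organized as a recursion. The sketch below is one way to do so and is not an algorithm stated in the text: it factors the numerator by trial division, treats the prime 2 by (b), and flips odd primes by (c). The function names are hypothetical. The last lines confirm for the Fermat primes 5, 17, 257, 65537 that 3 has order $n - 1$, by checking that $3^{(n-1)/2} \equiv -1 \bmod n$.

```python
def prime_symbol(q, p):
    """(q/p) for a prime q and an odd prime p with q < p."""
    if q == 2:
        return 1 if p % 8 in (1, 7) else -1              # conclusion (b)
    sign = -1 if (p % 4 == 3 and q % 4 == 3) else 1      # conclusion (c)
    return sign * legendre_by_reciprocity(p, q)

def legendre_by_reciprocity(a, p):
    """(a/p) for an odd prime p, using multiplicativity and Theorem 1.2."""
    a %= p
    if a == 0:
        return 0
    result = 1
    q = 2
    while q * q <= a:
        e = 0
        while a % q == 0:
            a //= q
            e += 1
        if e % 2 == 1:
            result *= prime_symbol(q, p)
        q += 1
    if a > 1:
        result *= prime_symbol(a, p)
    return result

# In a cyclic group of order a power of 2, an element whose half-order power
# is -1 is a generator.
for n in (5, 17, 257, 65537):
    assert legendre_by_reciprocity(3, n) == -1
    assert pow(3, (n - 1) // 2, n) == n - 1
```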


We prove two lemmas, give the proof of (b), prove a third lemma, and then 
give the proof of (c). 


Lemma 1.3. If p is an odd prime and a is any integer such that p does not 
divide a, then $a^{\frac12(p-1)} \equiv \left(\frac{a}{p}\right) \bmod p$. 

PROOF. The multiplicative group $\mathbb{F}_p^\times$ being cyclic, let b be a generator. Write 
$a \equiv b^r \bmod p$ for some integer r. Since $\left(\frac{a}{p}\right) = (-1)^r$ and $a^{\frac12(p-1)} = (b^r)^{\frac12(p-1)} = 
(b^{\frac12(p-1)})^r \equiv (-1)^r \bmod p$, the lemma follows. 
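
Lemma 1.3 gives a fast way to evaluate the Legendre symbol, since modular exponentiation is cheap. A short sketch, with the illustrative names `euler_symbol` and `is_square_mod`, compares it with the definition:

```python
def euler_symbol(a, p):
    """(a/p) computed as a^((p-1)/2) mod p, returned as +1 or -1 (p must not divide a)."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def is_square_mod(a, p):
    """True if a is a nonzero square modulo p, by direct search."""
    return any((x * x - a) % p == 0 for x in range(1, p))

for p in (3, 5, 7, 11, 13, 17, 19, 23, 29, 31):
    for a in range(1, p):
        assert (euler_symbol(a, p) == 1) == is_square_mod(a, p)
```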


Lemma 1.4 (Gauss). Let p be an odd prime, and let a be any integer such that 
p does not divide a. Among the least positive residues modulo p of the integers 
$a, 2a, 3a, \dots, \frac12(p-1)a$, let n denote the number of residues that exceed $p/2$. 
Then $\left(\frac{a}{p}\right) = (-1)^n$. 

PROOF. Let $r_1, \dots, r_n$ be the least positive residues exceeding $p/2$, and let 
$s_1, \dots, s_k$ be those less than $p/2$, so that $n + k = \frac12(p-1)$. The residues 
$r_1, \dots, r_n, s_1, \dots, s_k$ are distinct, since no two of $a, 2a, 3a, \dots, \frac12(p-1)a$ differ 
by a multiple of p. Each integer $p - r_i$ is strictly between 0 and $p/2$, and we 
cannot have any equality $p - r_i = s_j$, since $r_i + s_j = p$ would mean that $(u + v)a$ 
is divisible by p for some integers u and v with $1 \le u, v \le \frac12(p-1)$. Hence 

\[ p - r_1, \ \dots, \ p - r_n, \ s_1, \ \dots, \ s_k \]

is a permutation of $1, \dots, \frac12(p-1)$. Modulo p, we therefore have 

\[ \begin{aligned}
1 \cdot 2 \cdots \tfrac12(p-1) &\equiv (-1)^n r_1 \cdots r_n s_1 \cdots s_k \\
&\equiv (-1)^n a \cdot 2a \cdots \tfrac12(p-1)a \\
&\equiv (-1)^n a^{\frac12(p-1)} \, 1 \cdot 2 \cdots \tfrac12(p-1),
\end{aligned} \]

and cancellation yields $a^{\frac12(p-1)} \equiv (-1)^n \bmod p$. The result follows by combining 
this congruence with the conclusion of Lemma 1.3. 
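
The count n in Gauss's lemma is simple to compute. The following sketch, in which the function names are illustrative, compares $(-1)^n$ with the value of the Legendre symbol given by Lemma 1.3.

```python
def gauss_count(a, p):
    """Number of least positive residues of a, 2a, ..., ((p-1)/2)a that exceed p/2."""
    return sum(1 for u in range(1, (p - 1) // 2 + 1) if 2 * ((u * a) % p) > p)

def euler_symbol(a, p):
    """(a/p) computed as a^((p-1)/2) mod p, returned as +1 or -1."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

for p in (3, 5, 7, 11, 13, 17, 19, 23, 29, 31):
    for a in range(1, p):
        assert (-1) ** gauss_count(a, p) == euler_symbol(a, p)
```

For a = 2 the count reproduces the table used in the proof of (b): for p = 11, the residues 2, 4, 6, 8, 10 include three that exceed 11/2.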


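PROOF OF (b) IN THEOREM 1.2. We shall apply Lemma 1.4 with $a = 2$ after 
investigating the least positive residues of $2, 4, 6, \dots, p - 1$. We can list explicitly 
those residues that exceed $p/2$ for each odd value of p mod 8 as follows: 

\[ \begin{array}{ll}
p = 8k + 1, & 4k + 2,\ 4k + 4,\ \dots,\ 8k, \\
p = 8k + 3, & 4k + 2,\ 4k + 4,\ \dots,\ 8k + 2, \\
p = 8k + 5, & 4k + 4,\ \dots,\ 8k + 2,\ 8k + 4, \\
p = 8k + 7, & 4k + 4,\ \dots,\ 8k + 4,\ 8k + 6.
\end{array} \]

If n denotes the number of such residues for a given p, a count of each line of the 
above table shows that 

\[ \begin{array}{lll}
n = 2k & \text{and } (-1)^n = +1 & \text{for } p = 8k + 1, \\
n = 2k + 1 & \text{and } (-1)^n = -1 & \text{for } p = 8k + 3, \\
n = 2k + 1 & \text{and } (-1)^n = -1 & \text{for } p = 8k + 5, \\
n = 2k + 2 & \text{and } (-1)^n = +1 & \text{for } p = 8k + 7.
\end{array} \]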


Thus Lemma 1.4 shows that $\left(\frac{2}{p}\right) = +1$ for $p = 8k \pm 1$ and $\left(\frac{2}{p}\right) = -1$ for 
$p = 8k \pm 3$. This completes the proof of (b). 

Lemma 1.5. If p is an odd prime and a is a positive odd integer such that p 
does not divide a, then $\left(\frac{a}{p}\right) = (-1)^t$, where $t = \sum_{u=1}^{\frac12(p-1)} [ua/p]$. Here $[\,\cdot\,]$ denotes 
the greatest-integer function. 

REMARKS. When $a = 2$, the equality $\left(\frac{a}{p}\right) = (-1)^t$ fails for $p = 3$, since 
$t = [2/3] = 0$. 
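
Both the lemma and the remark can be checked directly. In the sketch below the function names are illustrative, and the Legendre symbol is evaluated by Lemma 1.3.

```python
def floor_sum(a, p):
    """t = sum of [ua/p] for u = 1, ..., (p-1)/2."""
    return sum((u * a) // p for u in range(1, (p - 1) // 2 + 1))

def euler_symbol(a, p):
    """(a/p) computed as a^((p-1)/2) mod p, returned as +1 or -1."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

# Lemma 1.5 for odd positive a not divisible by p.
for p in (3, 5, 7, 11, 13, 17, 19, 23):
    for a in range(1, 60, 2):
        if a % p != 0:
            assert (-1) ** floor_sum(a, p) == euler_symbol(a, p)

# The hypothesis that a is odd cannot be dropped: a = 2, p = 3 gives t = 0,
# while 2 is not a square modulo 3.
assert floor_sum(2, 3) == 0 and euler_symbol(2, 3) == -1
```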


PROOF. With notation as in Lemma 1.4 and its proof, we form each $ua$ for $1 \le 
u \le \frac12(p-1)$ and reduce modulo p, obtaining as least positive residue either some 
$r_i$ for $i \le n$ or some $s_j$ for $j \le k$. Then $ua/p = [ua/p] + p^{-1}(\text{some } r_i \text{ or } s_j)$. 
Hence 

\[ \sum_{u=1}^{\frac12(p-1)} ua = \sum_{u=1}^{\frac12(p-1)} p[ua/p] + \sum_{i=1}^{n} r_i + \sum_{j=1}^{k} s_j. \tag{$*$} \]

The proof of Lemma 1.4 showed that $p - r_1, \dots, p - r_n, s_1, \dots, s_k$ is a permutation 
of $1, \dots, \frac12(p-1)$, and thus the sum is the same in the two cases: 

\[ \sum_{u=1}^{\frac12(p-1)} u = \sum_{i=1}^{n} (p - r_i) + \sum_{j=1}^{k} s_j = np - \sum_{i=1}^{n} r_i + \sum_{j=1}^{k} s_j. \]

Subtracting this equation from (*), we obtain 

\[ (a - 1) \sum_{u=1}^{\frac12(p-1)} u = p \Big( \sum_{u=1}^{\frac12(p-1)} [ua/p] - n \Big) + 2 \sum_{i=1}^{n} r_i. \]

Replacing $\sum_{u=1}^{\frac12(p-1)} u$ on the left side by its value $\frac18(p^2 - 1)$ and taking into account 
that p is odd, we obtain the following congruence modulo 2: 

\[ (a - 1)\tfrac18(p^2 - 1) \equiv \sum_{u=1}^{\frac12(p-1)} [ua/p] - n \bmod 2. \]

Since a is odd, the left side is congruent to 0 modulo 2. Therefore $n \equiv 
\sum_{u=1}^{\frac12(p-1)} [ua/p] = t \bmod 2$, and Lemma 1.4 allows us to conclude that $(-1)^t = 
(-1)^n = \left(\frac{a}{p}\right)$. 


PROOF OF (c) IN THEOREM 1.2. Let 

\[ S = \{ (x, y) \in \mathbb{Z} \times \mathbb{Z} \mid 1 \le x \le \tfrac12(p-1) \text{ and } 1 \le y \le \tfrac12(q-1) \}, \]

the number of elements in question being $|S| = \frac14(p-1)(q-1)$. We can write 
$S = S_1 \cup S_2$ disjointly with 

\[ S_1 = \{ (x, y) \mid qx > py \} \quad \text{and} \quad S_2 = \{ (x, y) \mid qx < py \}; \]

the exhaustion of S by $S_1$ and $S_2$ follows because $qx = py$ would imply that 
p divides $qx$ and hence that p divides x, contradiction. We can describe $S_1$ 
alternatively as 

\[ S_1 = \{ (x, y) \mid 1 \le x \le \tfrac12(p-1) \text{ and } 1 \le y < qx/p \}, \]

and therefore $|S_1| = \sum_{x=1}^{\frac12(p-1)} [qx/p]$, which is the integer t in Lemma 1.5 such 
that $(-1)^t = \left(\frac{q}{p}\right)$. Similarly we have $|S_2| = \sum_{y=1}^{\frac12(q-1)} [py/q]$, which is the integer 
t in Lemma 1.5 such that $(-1)^t = \left(\frac{p}{q}\right)$. Therefore 

\[ (-1)^{\frac14(p-1)(q-1)} = (-1)^{|S|} = (-1)^{|S_1|} (-1)^{|S_2|} = \left(\frac{q}{p}\right)\left(\frac{p}{q}\right), \]

and the proof is complete. 
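
The lattice-point count at the heart of this proof can be reproduced directly. The sketch below, with illustrative function names, splits the rectangle S into the points below and above the line $qx = py$ and compares the two counts with the floor sums of Lemma 1.5.

```python
def lattice_counts(p, q):
    """Sizes of S1 (qx > py) and S2 (qx < py) inside the rectangle S."""
    s1 = s2 = 0
    for x in range(1, (p - 1) // 2 + 1):
        for y in range(1, (q - 1) // 2 + 1):
            if q * x > p * y:
                s1 += 1
            else:                      # qx = py cannot occur for distinct primes
                s2 += 1
    return s1, s2

def floor_sum(a, p):
    """Sum of [ua/p] for u = 1, ..., (p-1)/2."""
    return sum((u * a) // p for u in range(1, (p - 1) // 2 + 1))

odd_primes = (3, 5, 7, 11, 13, 17, 19, 23)
for p in odd_primes:
    for q in odd_primes:
        if p != q:
            s1, s2 = lattice_counts(p, q)
            assert s1 == floor_sum(q, p) and s2 == floor_sum(p, q)
            assert s1 + s2 == (p - 1) * (q - 1) // 4
```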


3. Equivalence and Reduction of Quadratic Forms 


A binary quadratic form over $\mathbb{Z}$ is a function $F(x, y) = ax^2 + bxy + cy^2$ from 
$\mathbb{Z} \times \mathbb{Z}$ to $\mathbb{Z}$ with a, b, c in $\mathbb{Z}$. Following Gauss,⁴ we abbreviate this F as $(a, b, c)$. 
We shall always assume, without explicitly saying so, that the discriminant 
$D = b^2 - 4ac$ is not the square of an integer and that F is primitive in the sense 
that $\mathrm{GCD}(a, b, c) = 1$. When there is no possible ambiguity, we may say “form” 
or “quadratic form” in place of “binary quadratic form.” 


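Let $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ be a member of the group $\mathrm{GL}(2, \mathbb{Z})$ of integer matrices whose 
inverse is an integer matrix. The determinant of such a matrix is $\pm 1$. We can use 
this matrix to change variables, writing 

\[ \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \alpha x' + \beta y' \\ \gamma x' + \delta y' \end{pmatrix}. \]

Then $ax^2 + bxy + cy^2$ becomes 

\[ \begin{aligned}
& a(\alpha x' + \beta y')^2 + b(\alpha x' + \beta y')(\gamma x' + \delta y') + c(\gamma x' + \delta y')^2 \\
&\quad = (a\alpha^2 + b\alpha\gamma + c\gamma^2) x'^2 + (2a\alpha\beta + b\alpha\delta + b\beta\gamma + 2c\gamma\delta) x'y' + (a\beta^2 + b\beta\delta + c\delta^2) y'^2.
\end{aligned} \]

⁴ Disquisitiones Arithmeticae, Article 153. Actually, Gauss always assumed that the coefficient 
of xy is even and consequently wrote $(a, b, c)$ for $ax^2 + 2bxy + cy^2$. To study $x^2 + xy + y^2$, for 
example, he took $a = 2$, $b = 1$, $c = 2$. The convention of working with $ax^2 + bxy + cy^2$ is due to 
Eisenstein. 

If we associate the triple $(a, b, c)$ of $F(x, y)$ to the matrix $\begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix}$, then this 
formula shows that the triple $(a', b', c')$ of the new form $F'(x', y')$ is associated 
to the matrix 

\[ \begin{pmatrix} 2a' & b' \\ b' & 2c' \end{pmatrix} = \begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix} \begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}. \]

From this equality of matrices, we see that 

(i) the member $\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$ of $\mathrm{GL}(2, \mathbb{Z})$ has the effect of the identity transformation, 

(ii) the member $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} \alpha' & \beta' \\ \gamma' & \delta' \end{pmatrix}$ of $\mathrm{GL}(2, \mathbb{Z})$ has the effect of applying first 
$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ and then $\begin{pmatrix} \alpha' & \beta' \\ \gamma' & \delta' \end{pmatrix}$. 

These two facts say that we do not quite have the expected group action on forms 
on the left. Instead, we can say either that we have a group action on the right 
or that $gF$ is obtained from F by operating by $g^t$. Anyway, there are orbits, 
and they are what we really need. The discriminant $D = b^2 - 4ac$ of the form 
F is evidently minus the determinant of the associated matrix $\begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix}$, and the 
displayed equality of matrices thus implies that the discriminant of the form $F'$ 
is $D(\alpha\delta - \beta\gamma)^2$. Since $(\alpha\delta - \beta\gamma)^2 = 1$ for matrices in $\mathrm{GL}(2, \mathbb{Z})$, we conclude 
that 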


(iii) each member of GL(2, Z) preserves the discriminant of the form. 


Hence the group GL(2, Z) acts on the forms of discriminant D. 
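
The change of variables is straightforward to carry out by machine. In the sketch below a form is the triple (a, b, c), a matrix is a pair of rows, and the names `transform`, `discriminant`, and `matmul` are illustrative. The checks confirm fact (iii) and the order of composition described in (ii) on one example.

```python
def transform(form, g):
    """Coefficients of F(alpha x' + beta y', gamma x' + delta y') for F = (a, b, c)."""
    a, b, c = form
    (al, be), (ga, de) = g
    return (a * al * al + b * al * ga + c * ga * ga,
            2 * a * al * be + b * al * de + b * be * ga + 2 * c * ga * de,
            a * be * be + b * be * de + c * de * de)

def discriminant(form):
    a, b, c = form
    return b * b - 4 * a * c

def matmul(g, h):
    """Product of two 2-by-2 matrices given as pairs of rows."""
    (a, b), (c, d) = g
    (e, f), (p, q) = h
    return ((a * e + b * p, a * f + b * q), (c * e + d * p, c * f + d * q))

F = (2, 2, 3)                      # 2x^2 + 2xy + 3y^2, discriminant -20
g = ((1, 1), (0, 1))               # determinant +1
h = ((0, -1), (1, 0))              # determinant +1

print(transform(F, g))             # (2, 6, 7)
assert discriminant(transform(F, g)) == discriminant(F) == -20
# Applying g first and then h has the effect of the product g h.
assert transform(transform(F, g), h) == transform(F, matmul(g, h))
```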

Forms in the same orbit under GL(2, Z) are said to be equivalent. Forms in 
the same orbit under the subgroup SL(2, Z) are said to be properly equivalent. 
A proper equivalence class of forms will refer to the latter relation. This notion 
is due to Gauss. Equivalence under GL(2, Z) is an earlier notion due to Lagrange, 
and we shall refer to its classes as ordinary equivalence classes on the infrequent 
occasions when the notion arises. Proper equivalence is necessary later in order 
to get a group operation on classes of forms. If one form can be carried to another 
form by a member of GL(2, Z) of determinant $-1$, we say that the two forms are 
improperly equivalent. Use of the matrix $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ shows that the form $(a, b, c)$ is 
improperly equivalent to the form $(a, -b, c)$. In particular, $(a, 0, c)$ is improperly 
equivalent to itself. 

The discriminant D is congruent to $b^2$ modulo 4 and hence is congruent to 0 
or 1 modulo 4. All nonsquare integers D that are congruent to 0 or 1 modulo 4 
arise as discriminants; in fact, we can always achieve such a D with $a = 1$ and 
with b equal either to 0 or to 1. 


The discriminant is minus the determinant of the matrix $\begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix}$ associated to 
$(a, b, c)$, and this matrix is real symmetric with trace $2(a + c)$. Since $D = b^2 - 4ac$ 
is assumed not to be the square of an integer, neither a nor c can be 0. 

If $D > 0$, the symmetric matrix $\begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix}$ is indefinite, having eigenvalues 
of opposite sign. In this case the Dirichlet class number of D, denoted by 
$h(D)$, is defined to be the number⁵ of all proper equivalence classes of forms of 
discriminant D. 

If $D < 0$, then a and c have the same sign. The matrix $\begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix}$ is positive 
definite if a and c are positive, and it is negative definite if a and c are negative. 
Correspondingly we refer to the form $(a, b, c)$ as positive definite or negative 
definite in the two cases. Since $g^t \begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix} g$ is positive definite whenever $\begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix}$ 
is positive definite, any form equivalent to a positive definite form is again positive 
definite. A similar remark applies to negative definite forms. Thus “positive 
definite” and “negative definite” are class properties. For any given discriminant 
$D < 0$, the Dirichlet class number of D, denoted by $h(D)$, is the number⁶ of 
proper equivalence classes of positive definite forms of discriminant D. 

The form $(a, b, c)$ represents an integer m if $ax^2 + bxy + cy^2 = m$ is solvable 
for some integers x and y. The form primitively represents m if the x and 
y with $ax^2 + bxy + cy^2 = m$ can be chosen to be relatively prime. In any 
event, $\mathrm{GCD}(x, y)^2$ divides m, and thus whenever a form represents a prime p, it 
primitively represents p. 


Theorem 1.6. Fix a nonsquare discriminant D. 


(a) The Dirichlet class number h(D) is finite. In fact, any form of discriminant
D is properly equivalent to a form (a, b, c) with $|b| \le |a| \le |c|$ and therefore
has $3|ac| \le |D|$, and the number of forms of discriminant D satisfying all these
inequalities is finite.

(b) An odd prime p with GCD(D, p) = 1 is primitively representable by some
form (a, b, c) of discriminant D if and only if $\left(\frac{D}{p}\right) = +1$. In this case the number
of proper equivalence classes of forms primitively representing p is either 1 or 2,
and these classes are carried to one another by GL(2, Z). In fact, if $\left(\frac{D}{p}\right) = +1$,
then $b^2 \equiv D \bmod 4p$ for some integer b, and representatives of these classes may
be taken to be $\left(p, \pm b, \frac{b^2 - D}{4p}\right)$.


² This number was studied by Dirichlet. According to Theorem 1.20 below, it counts the "strict
equivalence classes" of ideals in a sense that is introduced in Section 7. This number either equals or
is twice the number of equivalence classes of ideals in the other sense that is introduced in Section 7.
The latter is what is generalized in Chapter V in the subject of algebraic number theory, and the latter
is how "class number" is usually defined in modern books in algebraic number theory. Consequently
Dirichlet class numbers sometimes are twice what modern class numbers are. We use "Dirichlet
class numbers" in this chapter and change to the modern "class numbers" in Chapter V.

³ This number was studied by Dirichlet. See the previous footnote for further information.

We come to the proof after some preliminary remarks and examples. The
argument for (a) is constructive, and thus the forms given explicitly in (b) can 
be transformed constructively into properly equivalent forms satisfying the con- 
ditions of (a). Hence we are led to explicit forms as in (a) representing p. A 
generalization of (b) concerning how a composite integer m can be represented if 
GCD(D, m) = 1 appears in Problem 2 at the end of the chapter. What is missing 
in all this is a description of proper equivalences among the forms as in (a). We 
shall solve this question readily in Proposition 1.7 when D < 0. For D > 0, the 
answer is more complicated; we shall say what it is in Theorem 1.8, but we shall 
omit some of the proof of that theorem. 


EXAMPLES. 

(1) D = −4. Theorem 1.2a shows that the odd primes with $\left(\frac{D}{p}\right) = +1$ are
those of the form 4k + 1. Theorem 1.6a says that each proper equivalence class
of forms of discriminant −4 has a representative (a, b, c) with $3|ac| \le 4$. Since
D < 0, we are interested only in positive definite forms, which necessarily have
a and c positive. Thus a = c = 1, and we must have b = 0. So there is only
one class of (positive definite) forms of discriminant −4, namely $x^2 + y^2$, and
Theorem 1.6b allows us to conclude that $x^2 + y^2 = p$ is solvable for each prime
p = 4k + 1. In other words, we recover the conclusion of Proposition 1.1 as far
as representability of primes is concerned.


(2) D = −20. To have $\left(\frac{D}{p}\right) = +1$ for an odd prime p, we must have either
$\left(\frac{-1}{p}\right) = \left(\frac{5}{p}\right) = +1$ or $\left(\frac{-1}{p}\right) = \left(\frac{5}{p}\right) = -1$. Theorem 1.2 shows in the first case
that $p \equiv 1 \bmod 4$ and $p \equiv \pm 1 \bmod 5$, while in the second case $p \equiv 3 \bmod 4$
and $p \equiv \pm 3 \bmod 5$. That is, p is congruent to one of 1 and 9 modulo 20 in the
first case and to one of 3 and 7 modulo 20 in the second case. Let us consider
the forms as in Theorem 1.6a. We know that a > 0 and c > 0. The inequality
$3ac \le |D|$ forces $ac \le 6$. Since $|b| \le a \le c$, we obtain $a^2 \le 6$ and $a \le 2$.
Since 4 divides D, b is even. Then b = 0 or b = ±2. So the only possibilities
are (1, 0, 5) and (2, ±2, 3). Because of Theorem 1.6b, any prime congruent to
one of 1, 3, 7, 9 modulo 20 is representable either by (1, 0, 5) and not (2, ±2, 3),
or by (2, ±2, 3) and not (1, 0, 5). We can write down all residues modulo 20 for
$x^2 + 5y^2$ and $2x^2 \pm 2xy + 3y^2$, and we find that the possible residues prime to
20 are 1 and 9 in the first case, and they are 3 and 7 in the second case. The
conclusion for odd primes p with GCD(20, p) = 1 is that


$p \equiv 1$ or $9 \bmod 20$ implies p is representable as $x^2 + 5y^2$,

$p \equiv 3$ or $7 \bmod 20$ implies p is representable as $2x^2 \pm 2xy + 3y^2$.


The residues modulo 20 have shown that $x^2 + 5y^2$ is not equivalent to either of
$2x^2 \pm 2xy + 3y^2$, but they do not show whether $2x^2 \pm 2xy + 3y^2$ are properly
equivalent to one another. Hence the Dirichlet class number h(−20) is either 2
or 3. It will turn out to be 2.


(3) D = −56. To have $\left(\frac{D}{p}\right) = +1$ for an odd prime p, we must have an
odd number of the Legendre symbols $\left(\frac{-1}{p}\right)$, $\left(\frac{2}{p}\right)$, and $\left(\frac{7}{p}\right)$ equal to +1 and the
rest equal to −1. We readily find from Theorem 1.2 that the possibilities with
GCD(56, p) = 1 are

$p \equiv 1, 3, 5, 9, 13, 15, 19, 23, 25, 27, 39, 45 \bmod 56$.


Applying Theorem 1.6a as in the previous example, we find that $x^2 + 14y^2$,
$2x^2 + 7y^2$, and $3x^2 \pm 2xy + 5y^2$ are representatives of all proper equivalence
classes of forms of discriminant −56. Taking into account Theorem 1.6b and the
residue classes of these forms modulo 56, we conclude for odd primes p that


if $p \equiv$ any of 1, 9, 15, 23, 25, 39 mod 56, then
p is representable as $x^2 + 14y^2$ or $2x^2 + 7y^2$,

if $p \equiv$ any of 3, 5, 13, 19, 27, 45 mod 56, then
p is representable as both of $3x^2 \pm 2xy + 5y^2$.


The question left unsettled by the argument so far is whether $x^2 + 14y^2$ is properly
equivalent to $2x^2 + 7y^2$. Equivalent forms represent the same integers, and the
integer 1 is representable by $x^2 + 14y^2$ but not by $2x^2 + 7y^2$. Hence the two forms
are not equivalent and cannot be properly equivalent. According to Theorem
1.6b, the primes of the first line are therefore representable by either $x^2 + 14y^2$ or
$2x^2 + 7y^2$ but never by both. Hence the Dirichlet class number h(−56) is either
3 or 4. It will turn out to be 4.

(4) D = 5. The forms of discriminant 5 are indefinite. Applying Theorem
1.6a, we obtain $3|ac| \le 5$. Hence |a| = |c| = 1. Since D is odd, b is odd. The
inequality $|b| \le |a|$ thus forces |b| = 1. Then D = 1 − 4ac shows that ac < 0.
The possibilities are therefore (1, ±1, −1) and (−1, ±1, 1). The Dirichlet class
number h(5) is at most 4. It will turn out to be 1. Let us take this fact as known.
The odd primes p with $\left(\frac{5}{p}\right) = +1$ are p = 5k ± 1. Under the assumption that the
class number is 1, Theorem 1.6b shows that every such prime is representable as
$x^2 + xy - y^2$.
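The enumerations in these examples are mechanical, and a short computation can serve as a check. The following Python fragment is only a sketch: the name `candidate_forms` is ours and does not come from the text, and the search bound combines $3|ac| \le |D|$ with $|a| \le |c|$ to get $3a^2 \le |D|$. It lists every form (a, b, c) of discriminant D with $|b| \le |a| \le |c|$; distinct entries of the list may still be properly equivalent to one another.

```python
def candidate_forms(D):
    """List all (a, b, c) with b*b - 4*a*c == D and |b| <= |a| <= |c|.

    Assumes D is a nonsquare integer congruent to 0 or 1 modulo 4.
    """
    forms = []
    # 3|ac| <= |D| together with |a| <= |c| gives 3*a*a <= |D|.
    a_max = 0
    while 3 * (a_max + 1) ** 2 <= abs(D):
        a_max += 1
    for a in range(-a_max, a_max + 1):
        if a == 0:
            continue
        for b in range(-abs(a), abs(a) + 1):
            num = b * b - D              # must equal 4*a*c
            if num % (4 * a) != 0:
                continue
            c = num // (4 * a)
            if abs(c) >= abs(a):
                forms.append((a, b, c))
    return forms


# Positive definite candidates for D = -20 and D = -56, all candidates for D = 5.
print([f for f in candidate_forms(-20) if f[0] > 0])
print([f for f in candidate_forms(-56) if f[0] > 0])
print(candidate_forms(5))
```

For D = −20 the entries with a > 0 are (1, 0, 5) and (2, ±2, 3), and for D = 5 the four forms (1, ±1, −1) and (−1, ±1, 1) appear, in agreement with Examples 2 and 4.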


PROOF OF THEOREM 1.6a. We consider the effect of two transformations in
SL(2, Z), one via $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$ and the other via $\begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix}$. Under these, the matrix
associated to (a, b, c) becomes

$$\begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix} \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} = \begin{pmatrix} 2c & -b \\ -b & 2a \end{pmatrix}$$

and

$$\begin{pmatrix} 1 & 0 \\ n & 1 \end{pmatrix} \begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix} \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 2a & 2an + b \\ 2an + b & 2an^2 + 2bn + 2c \end{pmatrix},$$

respectively. Thus the transformations are

$$(a, b, c) \mapsto (c, -b, a), \qquad (*)$$
$$(a, b, c) \mapsto (a, 2an + b, c'). \qquad (**)$$

Possibly applying (*) allows us to make $|a| \le |c|$ while leaving |b| alone. Since
$a \ne 0$, we can apply (**) with n the closest integer to $-\frac{b}{2a}$ to make $|b| \le |a|$.
This step possibly changes c. Thus after this step, we again apply (*) if necessary
to make $|a| \le |c|$, and we apply (**) again. In each pair of steps, we may assume
that |b| strictly decreases or else that n = 0. We cannot always be in the former
case, since |b| is bounded below by 0. Thus at some point we obtain n = 0. At
this point, c does not change, and thus we have $|b| \le |a| \le |c|$, as required.
The inequalities $|b| \le |a| \le |c|$ imply that

$$4|ac| = |D - b^2| \le |D| + |b|^2 \le |D| + |ac|,$$

and hence $3|ac| \le |D|$. Since neither a nor c is 0, it follows that the inequalities
$|b| \le |a| \le |c|$ imply that |a|, |b|, |c| are all bounded by |D|. Therefore the
Dirichlet class number h(D) is finite.
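Since the argument for (a) is constructive, it can be sketched as a procedure. The Python fragment below is an illustration under stated assumptions, not a definitive implementation: the name `reduce_form` is ours, the discriminant is assumed to be nonsquare so that a and c never vanish, and the loop stops as soon as $|a| \le |c|$ holds after a step of type (**), which terminates because each swap of type (*) strictly decreases |a|.

```python
def reduce_form(a, b, c):
    """Return a properly equivalent form (a, b, c) with |b| <= |a| <= |c|.

    Assumes b*b - 4*a*c is not a perfect square, so a and c are never 0.
    """
    while True:
        if abs(a) > abs(c):
            a, b, c = c, -b, a              # transformation (*)
        m = 2 * abs(a)
        r = b % m                           # 0 <= r < 2|a|
        if r > abs(a):
            r -= m                          # now -|a| < r <= |a|
        n = (r - b) // (2 * a)              # exact division: r = b + 2*a*n
        b, c = r, a * n * n + b * n + c     # transformation (**), uses the old b
        if abs(a) <= abs(c):
            return (a, b, c)


print(reduce_form(7, 6, 2))     # a form of discriminant -20
print(reduce_form(29, 26, 6))   # another form of discriminant -20
```

The choice of n here puts the new middle coefficient in the interval $(-|a|, |a|]$, which is one way of taking n to be a closest integer to $-b/(2a)$.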


PROOF OF NECESSITY IN THEOREM 1.6b. Suppose x and y are integers with
GCD(x, y) = 1 and $ax^2 + bxy + cy^2 = p$. Then $ax^2 + bxy + cy^2 \equiv 0 \bmod p$.
Choose u and v with ux + vy = 1. Routine computation shows that

$$\begin{aligned}
4(ax^2 + bxy + cy^2)(av^2 - buv + cu^2)
&= [u(xb + 2yc) - v(2xa + yb)]^2 - (b^2 - 4ac)(xu + yv)^2 \\
&= [u(xb + 2yc) - v(2xa + yb)]^2 - (b^2 - 4ac),
\end{aligned}$$

and hence

$$0 \equiv [u(xb + 2yc) - v(2xa + yb)]^2 - (b^2 - 4ac) \bmod p.$$

Consequently $D \equiv [u(xb + 2yc) - v(2xa + yb)]^2 \bmod p$, and D is exhibited as
a square modulo p.


PROOF OF SUFFICIENCY IN THEOREM 1.6b. Choose an integer solution b of
$b^2 \equiv D \bmod p$. Since b + p is another solution and has the opposite parity,
we may assume that b and D have the same parity. Then $b^2 \equiv D \bmod p$ and
$b^2 \equiv D \bmod 4$, so that $b^2 \equiv D \bmod 4p$. Since GCD(D, p) = 1, p does not
divide b, and the forms $\left(p, \pm b, \frac{b^2 - D}{4p}\right)$ are primitive. They have discriminant
$b^2 - 4p\,\frac{b^2 - D}{4p} = D$, they take the value p for (x, y) = (1, 0), and they are
improperly equivalent via $\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$. Thus the forms in the statement of the theorem
exist.

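For the uniqueness suppose that a form (a, b, c) of discriminant D represents
p, say with $ax_0^2 + bx_0y_0 + cy_0^2 = p$. Since this representation has to be primitive,
we know that $\mathrm{GCD}(x_0, y_0) = 1$. Put $\begin{pmatrix} \alpha \\ \gamma \end{pmatrix} = \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}$ and choose integers $\beta$
and $\delta$ such that $\alpha\delta - \beta\gamma = 1$. Then $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ has determinant 1 and satisfies
$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}$. The equality $ax_0^2 + bx_0y_0 + cy_0^2 = \frac{1}{2} \begin{pmatrix} x_0 & y_0 \end{pmatrix} \begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix} \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}$
therefore yields

$$p = \tfrac{1}{2} \begin{pmatrix} 1 & 0 \end{pmatrix} \begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix} \begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix}.$$

Consequently the form (a', b', c') associated to the matrix $\begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix} \begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$
takes on the value p at (x, y) = (1, 0) and is properly equivalent to (a, b, c). In
particular, it is a form (p, b', c') for some b' and c' such that $b'^2 - 4pc' = D$.
Thus in the proof of uniqueness, we may assume that we have two forms
(p, b', c') and (p, b'', c'') of discriminant D. Then $b'^2 \equiv D \equiv b''^2 \bmod 4p$. The
conditions $b''^2 \equiv b'^2 \bmod p$ and $b''^2 \equiv b'^2 \bmod 4$ imply that $b'' \equiv \pm b' \bmod p$
and $b'' \equiv b' \bmod 2$ for one of the choices of sign. Thus $b'' \equiv \pm b' \bmod 2p$ for
that choice of sign. Let us write $b'' = \pm b' + 2np$ for some integer n. The matrix
equality

$$\begin{pmatrix} 1 & 0 \\ n & 1 \end{pmatrix} \begin{pmatrix} 2p & \pm b' \\ \pm b' & 2c' \end{pmatrix} \begin{pmatrix} 1 & n \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 2p & 2pn \pm b' \\ 2pn \pm b' & 2(*) \end{pmatrix}$$

shows that (p, ±b', c') is properly equivalent to (p, b'', *). Since the discriminant
has to be D, we conclude that * = c''. That is, (p, b'', c'') is properly equivalent to
(p, ±b', c') for that same choice of sign. Since (p, b', c') is improperly equivalent
to (p, −b', c'), the proof of the theorem is complete.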
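The forms produced in the sufficiency argument can be computed directly. The Python fragment below is a minimal sketch under stated assumptions: the name `form_representing_prime` is ours, and the search simply tries every b with $0 \le b < 2p$, which suffices because $b^2$ modulo 4p depends only on b modulo 2p. It returns a form $\left(p, b, \frac{b^2 - D}{4p}\right)$ of discriminant D, or `None` when no b with $b^2 \equiv D \bmod 4p$ exists.

```python
def form_representing_prime(D, p):
    """Return (p, b, c) with b*b - 4*p*c == D and 0 <= b < 2p, or None.

    Assumes p is an odd prime with GCD(D, p) = 1 and that D is a
    nonsquare integer congruent to 0 or 1 modulo 4.
    """
    for b in range(2 * p):
        if (b * b - D) % (4 * p) == 0:
            return (p, b, (b * b - D) // (4 * p))
    return None


print(form_representing_prime(-20, 7))    # 7 is 7 modulo 20
print(form_representing_prime(-20, 29))   # 29 is 9 modulo 20
print(form_representing_prime(-20, 11))   # 11 is not 1, 3, 7, or 9 modulo 20
```

Each form returned takes the value p at (x, y) = (1, 0); passing it through the reduction of Theorem 1.6a then identifies which form with $|b| \le |a| \le |c|$ represents p.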


Our discussion of representability of primes p by binary quadratic forms 
of discriminant D when GCD(D, p) = 1 will be complete once we have a 
set of representatives of proper equivalence classes with no redundancy. For 
discriminant D < 0, this step is not difficult and amounts, according to Theorem
1.6a, to sorting out proper equivalences among forms (a, b, c) with $b^2 - 4ac = D$
and $|b| \le |a| \le |c|$. Let us call a form with D < 0 reduced when it satisfies
these conditions.

There are two redundancies that are easy to spot, namely


(a, b, a) is properly equivalent to (a, −b, a) via $\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$,

(a, a, c) is properly equivalent to (a, −a, c) via $\begin{pmatrix} 1 & -1 \\ 0 & 1 \end{pmatrix}$.


The result for D < 0 is that there are no other redundancies among reduced
forms.


Proposition 1.7. Fix a negative discriminant D. With the exception of the 
proper equivalences of 


(a, b, a) to (a, −b, a)

and (a, a, c) to (a, −a, c),


no two distinct reduced positive definite forms of discriminant D are properly 
equivalent. 


PROOF. Suppose that (a, b, c) is properly equivalent to (a', b', c'), that both
are reduced, and that $a \ge a' > 0$. For some $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ in SL(2, Z), we have
$a' = a\alpha^2 + b\alpha\gamma + c\gamma^2$. Hence the inequalities $c \ge a$ and $-|b| \ge -a$ imply that

$$a \ge a\alpha^2 + b\alpha\gamma + c\gamma^2 \ge a(\alpha^2 + \gamma^2) + b\alpha\gamma \ge a(\alpha^2 + \gamma^2) - a|\alpha\gamma| \ge a|\alpha\gamma|, \qquad (*)$$

and $\alpha\gamma$ equals 0 or ±1. Thus the ordered pair $(\alpha, \gamma)$ is one of (0, ±1), (±1, 0),
(±1, 1), (±1, −1). Multiplying $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ if necessary by $\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}$, which acts
trivially on quadratic forms, we may assume that $(\alpha, \gamma)$ is one of (0, 1), (1, 0),
(1, ±1). We treat these three cases separately.

Case 1. $(\alpha, \gamma) = (0, 1)$. The condition $\alpha\delta - \beta\gamma = 1$ forces $\beta = -1$, and the
formula $b' = 2a\alpha\beta + b\alpha\delta + b\beta\gamma + 2c\gamma\delta$ gives $(a', b', c') = (c, -b + 2c\delta, *)$.
Since $|b| \le c$ and $|b - 2c\delta| \le c$, we must have $|\delta| \le 1$. If $\delta = 0$, we are
led to (a', b', c') = (c, −b, a), which is reduced only if c = a, and this is the
first of the two allowable exceptions. If $|\delta| = 1$, the triangle inequality gives
$2c = |2c\delta| \le |b| + |2c\delta - b| \le c + c = 2c$, and therefore $|b| = c = |b - 2c\delta|$.
Then $b = -(b - 2c\delta)$, and $b = c\delta = \pm c$. Since $|b| \le a \le c$, $b = \pm a$ also.
Hence (a', b', c') = (a, −b, a), and this is again the first of the two allowable
exceptions.

Case 2. $(\alpha, \gamma) = (1, 0)$. The condition $\alpha\delta - \beta\gamma = 1$ forces $\delta = 1$, and thus
$(a', b', c') = (a, b + 2a\beta, *)$. Since $|b| \le a$ and $|b + 2a\beta| \le a$, we must have
$|\beta| \le 1$. If $\beta = 0$, then (a', b', c') = (a, b, c), and there is nothing to prove.
If $|\beta| = 1$, the triangle inequality gives $2a = |2a\beta| \le |-b| + |2a\beta + b|$, and
therefore $|b| = a = |b + 2\beta a|$. Then $b = -(b + 2\beta a)$, and we conclude that
$b = -a\beta = \pm a$ and $b + 2\beta a = \mp a$. Hence the proper equivalence in question
is of (a, a, c) to (a, −a, c), which is the second of the two allowable exceptions.

Case 3. $(\alpha, \gamma) = (1, \pm 1)$. From (*) and the assumption that $a \ge a'$, we
have $a \ge a' \ge a|\alpha\gamma| = a$. Thus a = a', and the definition of a' shows that
$a = a + b\gamma + c$. Hence $c = -b\gamma$, and c = |b|. Since $|b| \le a \le c$, we obtain
$-b\gamma = a = c$. The formula $b' = 2a\alpha\beta + b\alpha\delta + b\beta\gamma + 2c\gamma\delta$ then simplifies to
$b' = 2a\beta + b\delta + b\beta\gamma + 2a\gamma\delta = (2a + b\gamma)(\beta + \gamma\delta)$. From $\alpha\delta - \beta\gamma = 1$, we
have $\delta - \beta\gamma = 1$ and thus also $\gamma\delta = \gamma + \beta$. Therefore $\beta + \gamma\delta = 2\beta + \gamma$, and
this cannot be 0. So $|b'| \ge |2a + b\gamma| = |2a - a| = a = a'$. Since (a', b', c')
is reduced, |b'| = a' = a = c = |b|, and the proper equivalence is of (a, a, a)
to (a, −a, a). This is an instance of both allowable exceptions, and the proof is
complete.


EXAMPLES, CONTINUED. 


(2) D = −20. We saw earlier that the reduced positive definite forms with
D = −20 are $x^2 + 5y^2$ and $2x^2 \pm 2xy + 3y^2$, i.e., (1, 0, 5) and (2, ±2, 3). The
remarks preceding Proposition 1.7 show that (2, 2, 3) is properly equivalent to
(2, −2, 3), and the proposition shows that (1, 0, 5) is not properly equivalent to
(2, 2, 3). (We saw this latter conclusion for this example earlier by considering
residues.) Consequently h(−20) = 2.

(3) D = −56. We saw earlier that the reduced positive definite forms with
D = −56 are $x^2 + 14y^2$, $2x^2 + 7y^2$, and $3x^2 \pm 2xy + 5y^2$, i.e., (1, 0, 14), (2, 0, 7),
(3, 2, 5), and (3, −2, 5). Proposition 1.7 shows that no two of these four forms
are properly equivalent. Consequently h(−56) = 4.
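Theorem 1.6a and Proposition 1.7 together give a way of computing h(D) for D < 0 by counting. The Python fragment below is a sketch under stated assumptions: the name `class_number_negative` is ours, and the count runs over reduced positive definite forms, keeping only the member with $b \ge 0$ from each of the two exceptional pairs (a, b, a), (a, −b, a) and (a, a, c), (a, −a, c) named in the proposition.

```python
def class_number_negative(D):
    """Count proper equivalence classes of positive definite forms of
    discriminant D < 0, with D congruent to 0 or 1 modulo 4."""
    count = 0
    a = 1
    while 3 * a * a <= -D:                # 3*a*c <= |D| and a <= c
        for b in range(-a, a + 1):
            num = b * b - D               # must equal 4*a*c
            if num % (4 * a) != 0:
                continue
            c = num // (4 * a)
            if c < a:
                continue
            # Exceptional pairs of Proposition 1.7: count only b >= 0.
            if b < 0 and (a == c or -b == a):
                continue
            count += 1
        a += 1
    return count


for D in (-4, -20, -56):
    print(D, class_number_negative(D))
```

The count includes imprimitive forms when there are any, in line with the definition of the Dirichlet class number used in this section.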


Let us turn our attention to D > 0. We still have the proper equivalences 
of (a,b,a) to (a, —b,a) and (a,a,c) to (a, —a,c) as in the remarks before 
Proposition 1.7. But there can be others, and the question is subtle. Here are 
some simple examples. 


EXAMPLES WITH POSITIVE DISCRIMINANT. 

(1) D = 5. The forms with D = 5 satisfying the inequalities $|b| \le |a| \le |c|$ of
Theorem 1.6a are (1, ±1, −1) and (−1, ±1, 1). The second standard equivalence
allows us to discard one form from each pair, and we are left with (1, 1, −1) and
(−1, −1, 1). The first of these two is equivalent to the second via $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$.
Thus h(5) = 1, as was announced without proof in Example 4 earlier in
this section.

(2) D = 13. The forms with D = 13 satisfying the inequalities $|b| \le |a| \le |c|$
of Theorem 1.6a are (1, ±1, −3) and (−1, ±1, 3). The second standard
equivalence allows us to discard one form from each pair, and we are left with
(1, 1, −3) and (−1, −1, 3). The first of these two is equivalent to the second via
$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} 1 & -2 \\ 1 & -1 \end{pmatrix}$. Thus h(13) = 1.

(3) D = 21. The forms with D = 21 satisfying the inequalities $|b| \le |a| \le |c|$
of Theorem 1.6a are (1, ±1, −5) and (−1, ±1, 5). The second standard
equivalence allows us to discard one form from each pair, and we are left with
(1, 1, −5) and (−1, −1, 5). These are not properly equivalent. In fact, the form
$-x^2 - xy + 5y^2$ is −1 for (x, y) = (1, 0), but $x^2 + xy - 5y^2 = -1$ is not even
solvable modulo 3. Thus h(21) = 2.


Although the starting data for these three examples are similar, the outcomes 
are strikingly different. The idea for what to do involves starting afresh with the 
reduction question that was addressed in Theorem 1.6a. For discriminant D > 0, 
a different reduction is to be used. The reduction in question appears in Theorem 
1.8a below, but some preliminary remarks are needed to explain the proof. 

Two forms (a, b, c) and (a', b', c') of discriminant D > 0 will be said to be
neighbors if c = a' and $b + b' \equiv 0 \bmod 2c$. More precisely we say in this
case that (a', b', c') is a neighbor on the right of (a, b, c) and that (a, b, c) is
a neighbor on the left of (a', b', c'). A key observation is that neighbors are
properly equivalent to one another. In fact, if (a', b', c') is a neighbor on the right
of (a, b, c), define $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} 0 & -1 \\ 1 & (b + b')/(2c) \end{pmatrix}$. Then computation gives

$$\begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix} \begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} 2c & b' \\ b' & (b' - b)\delta + 2a \end{pmatrix}.$$

The lower right entry of this matrix is an even integer, since $b + b' \equiv 0 \bmod 2c$
and since, as a consequence, $b + b' \equiv 0 \bmod 2$. Hence (a, b, c) is transformed
into (c, b', c'), where $c' = \frac{1}{2}(b' - b)\delta + a$.

Let us call a primitive form (a, b, c) of discriminant D > 0 reduced when it
satisfies the conditions

$$0 < b < \sqrt{D} \qquad \text{and} \qquad \sqrt{D} - b < 2|a| < \sqrt{D} + b.$$

The first inequality shows that b is bounded if D is fixed, and the equality
$-4ac = D - b^2$ shows that there are only finitely many possibilities for a
and c. Consequently there are only finitely many reduced forms for given D.

From $|b| < \sqrt{D}$, we see that $b^2 < D = b^2 - 4ac$ and ac < 0; thus any reduced
form has a and c of opposite sign. Then $D - b^2 = -4ac = (2|a|)(2|c|)$, and it
follows that $2|a| > \sqrt{D} - b$ implies $2|c| < \sqrt{D} + b$ and that $2|a| < \sqrt{D} + b$
implies $2|c| > \sqrt{D} - b$. Consequently

$$\sqrt{D} - b < 2|c| < \sqrt{D} + b.$$


Theorem 1.8. Fix a positive nonsquare discriminant D.


(a) Each form of discriminant D is properly equivalent to some reduced form 
of discriminant D. 

(b) Each reduced form of discriminant D is a neighbor on the left of one and 
only one reduced form of discriminant D and is a neighbor on the right of one 
and only one reduced form of discriminant D. 

(c) The reduced forms of discriminant D occur in uniquely determined cycles, 
each one of even length, such that each member of a cycle is an iterated neighbor 
on the right to all members of the cycle and consequently is properly equivalent 
to all other members of the cycle. 

(d) Two reduced forms of discriminant D are properly equivalent if and only 
if they lie in the same cycle in the sense of (c). 


REMARKS. Conclusion (d) is the deepest part of the theorem, involving a subtle
argument that in essence uses the periodic continued-fraction expansion of the
roots z of the polynomial $az^2 + bz + c$ if (a, b, c) is a form under consideration.
We shall prove (a) through (c), omitting the proof of (d), and then we shall return
to the three examples D = 5, 13, 21 begun just above.


PROOF OF THEOREM 1.8a. If (a, b, c) is given and is not reduced, let m be the
unique integer such that

$$\sqrt{D} - 2|c| < -b + 2cm < \sqrt{D}, \qquad (*)$$

and define $(a', b', c') = (c, -b + 2cm, a - bm + cm^2)$. Then

$$\begin{aligned}
b'^2 - 4a'c' &= (-b + 2cm)^2 - 4c(a - bm + cm^2) \\
&= b^2 - 4bcm + 4c^2m^2 - 4ac + 4bcm - 4c^2m^2 = b^2 - 4ac = D,
\end{aligned}$$

and we observe that a' = c and that $b + b' = 2cm \equiv 0 \bmod 2c$. Consequently
(a', b', c') is a form of discriminant D and is a right neighbor to (a, b, c). By the
remarks before the theorem, (a, b, c) is properly equivalent to (a', b', c').

We repeat this process at least once, obtaining (a'', b'', c''). If |a''| < |a'|, we
repeat it again, obtaining (a''', b''', c'''), and we continue in this way. Eventually
the strict decrease of the magnitude of the first entry must stop. To keep the
notation simple, we may assume without loss of generality that $|a''| \ge |a'|$. The
claim is that (a', b', c') is then reduced.

Put $u = \sqrt{D} - b'$ and $v = b' - (\sqrt{D} - 2|a'|)$. The inequalities (*) show that
u > 0 and v > 0. Therefore

$$\begin{aligned}
0 < v^2 + 2uv + 2u\sqrt{D} &= (u + v)^2 - u^2 + 2u\sqrt{D} \\
&= 4a'^2 - (D - 2b'\sqrt{D} + b'^2) + 2D - 2b'\sqrt{D} \\
&= 4a'^2 + D - b'^2 = 4a'^2 - 4a'c'.
\end{aligned}$$

Since $|c'| = |a''| \ge |a'|$, this inequality shows that a'c' < 0. Therefore $b'^2 =
D + 4a'c' < D$, and $|b'| < \sqrt{D}$.
From a'c' < 0 and $|a'| \le |c'|$, we see that $4|a'|^2 \le 4|a'c'| = -4a'c' =
D - b'^2 \le D$. Therefore $2|a'| < \sqrt{D}$. The inequality $\sqrt{D} - 2|c| < b'$ implies
that $\sqrt{D} - b' < 2|c| = 2|a'|$. The right side has just been shown to be $< \sqrt{D}$,
and therefore b' > 0. Hence $\sqrt{D} - b' < 2|a'| < \sqrt{D} < \sqrt{D} + b'$.


PROOF OF THEOREM 1.8b. Suppose that (a, b, c) is reduced and that (a', b', c')
is a reduced neighbor on the right of (a, b, c). Then we must have a' = c and
$b + b' \equiv 0 \bmod 2c$. Since $\sqrt{D} - b' < 2|a'|$ and $b' < \sqrt{D}$, we have $\sqrt{D} - 2|a'| <
b' < \sqrt{D}$. That is, $\sqrt{D} - 2|c| < b' < \sqrt{D}$. These inequalities in combination
with the congruence $b + b' \equiv 0 \bmod 2c$ show that (a, b, c) uniquely determines
b'. Since (a', b', c') is to have discriminant D, c' is uniquely determined also.

We turn this construction around to prove existence of a right neighbor. Define
(a', b', c') in terms of (a, b, c) as in the proof of Theorem 1.8a. Then a' = c, and
b' is the unique integer such that $b + b' \equiv 0 \bmod 2c$ and

$$\sqrt{D} - 2|c| < b' < \sqrt{D}.$$

The form (a', b', c') is a right neighbor of (a, b, c), and we are to show that
(a', b', c') is reduced.

Since (a, b, c) is reduced, we have $\sqrt{D} - b < 2|c| < \sqrt{D} + b$ and $b < \sqrt{D}$.
Let m be the integer such that b + b' = 2m|c|. Addition of the inequalities
$b' - (\sqrt{D} - 2|c|) > 0$ and $\sqrt{D} + b - 2|c| > 0$ gives 2m|c| = b + b' > 0,
and thus m > 0. Hence $m - 1 \ge 0$. Addition of the inequalities $\sqrt{D} - b > 0$
and $b' - (\sqrt{D} - 2|c|) > 0$ gives $0 < b' - b + 2|c| = 2b' - (b + b') + 2|c| =
2b' - 2(m - 1)|c|$. Hence $2b' > 2(m - 1)|c| \ge 0$, and we see that b' > 0.
Therefore $0 < b' < \sqrt{D}$.

The definition of b' gives $\sqrt{D} - b' < 2|c| = 2|a'|$. Addition of the inequalities
$2(m - 1)|c| \ge 0$ and $\sqrt{D} - b > 0$ gives $b + b' - 2|c| + \sqrt{D} - b > 0$, which
says that $2|a'| < \sqrt{D} + b'$. Therefore (a', b', c') is reduced.

Let R be the operation of passing from a reduced form (a, b, c) to its unique 
reduced right neighbor (a’, b’, c’). What we have just shown implies that R acts 
as a permutation of the finite set of reduced forms of discriminant D. This set 
being finite, let n be the order of R. Then the set $\{R^k \mid 0 \le k \le n - 1\}$ is a
cyclic group of permutations of the set of reduced forms of discriminant D. The 
existence of a two-sided inverse of R as a permutation implies that each reduced 
form of discriminant D has exactly one left neighbor. Thus the existence and 
uniqueness of neighbors on one side for reduced forms, in the presence of the 
finiteness of the set, implies existence and uniqueness on the other side.

PROOF OF THEOREM 1.8c. We continue with R as the operation of passing from
a reduced form to its unique reduced right neighbor, letting $\{R^k \mid 0 \le k \le n - 1\}$
be the finite cyclic group of powers of R. This group acts on the set of reduced
forms of discriminant D, and the cycles in question are the orbits under this action.
To see that each orbit has an even number of members, we recall that a reduced
form (a, b, c) has a and c of opposite sign. Thus if, for example, a is positive,
then $R^l(a, b, c) = (a', b', c')$ has $(-1)^l a'$ positive. If the orbit of (a, b, c) has
k members, then $R^k(a, b, c) = (a, b, c)$. Consequently $(-1)^k a$ has to have the
same sign as a, and k has to be even. Finally the members of each orbit are
properly equivalent to one another because, as we observed before the statement
of the theorem, a form is properly equivalent to each of its neighbors.
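The passage to right neighbors and the resulting cycles can be computed explicitly. The Python fragment below is a sketch under stated assumptions: the names `is_reduced`, `reduced_forms`, `right_neighbor`, and `cycles` are ours, and every comparison with $\sqrt{D}$ is rewritten in integer arithmetic by squaring or by using the integer part of $\sqrt{D}$, which is legitimate because D is not a square. It lists the reduced primitive forms of a positive nonsquare discriminant and groups them into cycles under the operation R.

```python
from math import gcd, isqrt


def is_reduced(a, b, c, D):
    """Test 0 < b < sqrt(D) and sqrt(D) - b < 2|a| < sqrt(D) + b."""
    if b <= 0 or b * b >= D:
        return False
    t = 2 * abs(a)
    return (t + b) ** 2 > D and (t - b <= 0 or (t - b) ** 2 < D)


def reduced_forms(D):
    """List the reduced primitive forms of nonsquare discriminant D > 0."""
    forms = []
    for b in range(1, isqrt(D) + 1):
        if (D - b * b) % 4 != 0:
            continue
        n = (D - b * b) // 4              # n = -a*c is positive
        for a in range(1, n + 1):
            if n % a != 0:
                continue
            for x, z in ((a, -(n // a)), (-a, n // a)):
                if gcd(gcd(x, b), z) == 1 and is_reduced(x, b, z, D):
                    forms.append((x, b, z))
    return forms


def right_neighbor(a, b, c, D):
    """Right neighbor (c, b', c') with sqrt(D) - 2|c| < b' < sqrt(D)."""
    s, m = isqrt(D), 2 * abs(c)
    b2 = s - ((s + b) % m)                # b + b2 is divisible by 2|c|
    return (c, b2, (b2 * b2 - D) // (4 * c))


def cycles(D):
    """Group the reduced forms of discriminant D into cycles under R."""
    remaining = set(reduced_forms(D))
    result = []
    while remaining:
        start = min(remaining)
        cycle, f = [], start
        while True:
            cycle.append(f)
            remaining.discard(f)
            f = right_neighbor(*f, D)
            if f == start:
                break
        result.append(cycle)
    return result


for D in (5, 13, 21):
    print(D, cycles(D))
```

By conclusion (d) of Theorem 1.8, whose proof is omitted, the number of cycles is the number of proper equivalence classes of primitive forms, so that the three discriminants above give 1, 1, and 2 cycles.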


EXAMPLES WITH POSITIVE DISCRIMINANT, CONTINUED. 

(1) D = 5. The forms with D = 5 satisfying the inequalities of Theorem
1.8a are (1, 1, −1) and (−1, 1, 1), and these consequently represent all proper
equivalence classes. They form a single cycle and are properly equivalent by
Theorem 1.8c. Thus again we obtain the easy conclusion that h(5) = 1.

(2) D = 13. The forms with D = 13 satisfying the inequalities of Theorem
1.8a are (1, 3, −1) and (−1, 3, 1), which make up a single cycle. Thus h(13) = 1.

(3) D = 21. The forms with D = 21 satisfying the inequalities of Theorem
1.8a are (1, 3, −3) and (−3, 3, 1), which make up one cycle, and (−1, 3, 3) and
(3, 3, −1), which make up another cycle. Thus h(21) = 2.

The identity $(x_1^2 + y_1^2)(x_2^2 + y_2^2) = (x_1x_2 - y_1y_2)^2 + (x_1y_2 + x_2y_1)^2$, which can be
derived by factoring the left side in $\mathbb{Q}(\sqrt{-1})[x_1, y_1, x_2, y_2]$ and rearranging the
factors, readily generalizes to an identity involving any form $x^2 + bxy + cy^2$ of
nonsquare discriminant $D = b^2 - 4c$. We complete the square, writing the form
as $(x + \frac{1}{2}by)^2 - \frac{1}{4}y^2D$ and factoring it as $(x + \frac{1}{2}by + \frac{1}{2}y\sqrt{D})(x + \frac{1}{2}by - \frac{1}{2}y\sqrt{D})$,
and we obtain

$$\begin{aligned}
(x_1^2 + bx_1y_1 + cy_1^2)(x_2^2 + bx_2y_2 + cy_2^2)
&= (x_1x_2 - cy_1y_2)^2 + b(x_1x_2 - cy_1y_2)(x_1y_2 + x_2y_1 + by_1y_2) \\
&\quad + c(x_1y_2 + x_2y_1 + by_1y_2)^2.
\end{aligned}$$
Improving on an earlier attempt by Legendre, Gauss made a thorough inves- 
tigation of how one might multiply two distinct forms of the same nonsquare 
discriminant, not necessarily with first coefficient 1, and Dirichlet reworked the 


theory and simplified it. Out of this work comes the following composition 
formula, of which the above formula is manifestly a special case.

Proposition 1.9. Let $(a_1, b, c_1)$ and $(a_2, b, c_2)$ be two primitive forms with the
same middle coefficient b and with the same nonsquare discriminant D, hence
with $a_1c_1 = a_2c_2 \ne 0$. Suppose that $j = c_1a_2^{-1} = c_2a_1^{-1}$ is an integer. Then the
form $(a_1a_2, b, j)$ is primitive of discriminant D, and it has the property that

$$\begin{aligned}
(a_1x_1^2 + bx_1y_1 + c_1y_1^2)(a_2x_2^2 + bx_2y_2 + c_2y_2^2)
&= a_1a_2(x_1x_2 - jy_1y_2)^2 + b(x_1x_2 - jy_1y_2)(a_1x_1y_2 + a_2x_2y_1 + by_1y_2) \\
&\quad + j(a_1x_1y_2 + a_2x_2y_1 + by_1y_2)^2.
\end{aligned}$$


REMARKS. Consequently if an integer m is represented by the form $(a_1, b, c_1)$
and an integer n is represented by the form $(a_2, b, c_2)$, then mn is represented by
the form $(a_1a_2, b, j)$. For example we saw in an example with D = −20 immediately
following the statement of Theorem 1.6 that any prime that is congruent to
3 or 7 modulo 20 is representable as $2x^2 + 2xy + 3y^2$. If we have two such primes
p and q, then p is representable by (2, 2, 3) and q is representable by (3, 2, 2).
The proposition is applicable with j = 1 and shows that pq is representable by
(6, 2, 1). In turn, substitution using $\begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix}$ changes this form to the
properly equivalent form (5, 0, 1). Thus pq is representable as $x^2 + 5y^2$.


PROOF. The form $(a_1a_2, b, j)$ is primitive because any prime that divides
$\mathrm{GCD}(a_1a_2, b, j)$ has to divide either $\mathrm{GCD}(a_1, b, j)$ or $\mathrm{GCD}(a_2, b, j)$ and then
certainly has to divide $\mathrm{GCD}(a_1, b, c_1)$ or $\mathrm{GCD}(a_2, b, c_2)$. No such prime exists,
and hence $(a_1a_2, b, j)$ is primitive. The discriminant of $(a_1a_2, b, j)$ is
$b^2 - 4ja_1a_2 = D + 4a_1c_1 - 4ja_1a_2 = D + 4a_1c_1 - 4(c_1a_2^{-1})a_1a_2 = D$,
as asserted, and the verification of the displayed identity is a routine computation.
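The routine computation can be spot-checked numerically. The Python fragment below is a sketch under stated assumptions: the names `compose`, `value`, and `composed_point` are ours, and a form is treated as aligned with another exactly when the middle coefficients agree and $j = c_1a_2^{-1} = c_2a_1^{-1}$ is an integer, as in Proposition 1.9. It builds the composed form $(a_1a_2, b, j)$ and the point at which that form takes the product of two represented values.

```python
def compose(f1, f2):
    """Return (a1*a2, b, j) for forms f1 = (a1, b, c1) and f2 = (a2, b, c2)
    having the same middle coefficient and integer j = c1/a2 = c2/a1."""
    (a1, b1, c1), (a2, b2, c2) = f1, f2
    if b1 != b2 or c1 % a2 != 0 or (c1 // a2) * a1 != c2:
        raise ValueError("the two forms do not satisfy the hypotheses")
    return (a1 * a2, b1, c1 // a2)


def value(f, x, y):
    """Evaluate the form f = (a, b, c) at the point (x, y)."""
    a, b, c = f
    return a * x * x + b * x * y + c * y * y


def composed_point(f1, f2, x1, y1, x2, y2):
    """Point (X, Y) at which compose(f1, f2) takes the value
    value(f1, x1, y1) * value(f2, x2, y2)."""
    a1, b, _ = f1
    a2 = f2[0]
    j = compose(f1, f2)[2]
    return (x1 * x2 - j * y1 * y2, a1 * x1 * y2 + a2 * x2 * y1 + b * y1 * y2)


f1, f2 = (2, 2, 3), (3, 2, 2)              # the forms of discriminant -20 used above
X, Y = composed_point(f1, f2, 1, 1, 1, -1)
print(compose(f1, f2), value(f1, 1, 1) * value(f2, 1, -1), value(compose(f1, f2), X, Y))
```

A check of this kind over finitely many points is of course no substitute for the algebraic verification; it only guards against slips in signs and subscripts.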


Let us say that two primitive forms $(a_1, b_1, c_1)$ and $(a_2, b_2, c_2)$ of the same
nonsquare discriminant are aligned if $b_1 = b_2$ and if $j = c_1a_2^{-1} = c_2a_1^{-1}$ is an
integer. In the presence of equal nonsquare discriminants D and the equal middle
entries b, the rational number j is automatically an integer if $\mathrm{GCD}(a_1, a_2) = 1$.
In fact, the equality $D - b^2 = -4a_1c_1 = -4a_2c_2$ shows that $D - b^2$ is divisible
by $4a_1$ and by $4a_2$; since $\mathrm{GCD}(a_1, a_2) = 1$, $D - b^2$ is divisible by $4a_1a_2$, and the
quotient −j is an integer.

The idea is that each pair of classes of properly equivalent primitive forms 
of discriminant D has a pair of aligned representatives, and a multiplication of 
proper equivalence classes is well defined if the product is defined as the class of 
the composition of these aligned representatives in the sense of Proposition 1.9. 
This multiplication for proper equivalence classes will make the set of classes 
into a finite abelian group. This group will be defined as the “form class group” 
for the discriminant D, except that we use only the positive definite classes in the
case that D < 0. Before phrasing these statements as a theorem, we make some
remarks and then state and prove two lemmas. 

Let (a, b, c) be a form of nonsquare discriminant D, and let b' be an integer
with $b' \equiv b \bmod 2a$. In this case the number $c' = (b'^2 - D)/(4a)$ is an integer;
in fact, we certainly have the congruences $b'^2 \equiv b^2 \bmod 2a$ and $b'^2 \equiv b^2 \bmod 4$,
and thus we obtain the automatic⁴ consequence $b'^2 \equiv b^2 \bmod 4a$, the rewritten
congruence $b'^2 \equiv D + 4ac \bmod 4a$, and the desired result $b'^2 - D \equiv 0 \bmod 4a$.
Hence (a, b', c') is another form of discriminant D. We call (a, b', c') a translate
of (a, b, c). The key observation about translates is that the translate (a, b', c') is
properly equivalent to (a, b, c). This fact follows from the computation

$$\begin{pmatrix} 1 & 0 \\ l & 1 \end{pmatrix} \begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix} \begin{pmatrix} 1 & l \\ 0 & 1 \end{pmatrix} = \begin{pmatrix} 2a & b + 2al \\ b + 2al & 2(al^2 + bl + c) \end{pmatrix} = \begin{pmatrix} 2a & b' \\ b' & 2c' \end{pmatrix},$$

valid for any integer l.


Lemma 1.10. If (a, b, c) is a primitive form of nonsquare discriminant and if
$m \ne 0$ is an integer, then (a, b, c) primitively represents some integer relatively
prime to m.


PROOF. Let

$w_0$ = product of all primes dividing a, c, and m,
$x_0$ = product of all primes dividing a and m but not c,
$y_0$ = product of all primes dividing m but not a.

Referring to the definitions, we see that any prime dividing m divides exactly
one of $w_0$, $x_0$, and $y_0$. In particular, $\mathrm{GCD}(x_0, y_0) = 1$. We shall show that
$\mathrm{GCD}(m, ax_0^2 + bx_0y_0 + cy_0^2) = 1$, and the proof will be complete. Arguing by
contradiction, suppose that a prime p divides $\mathrm{GCD}(m, ax_0^2 + bx_0y_0 + cy_0^2)$. There
are three cases for p, as follows.

Case 1. If p divides $x_0$, then the fact that p divides $ax_0^2 + bx_0y_0 + cy_0^2$ implies
that p divides $cy_0^2$. Since p does not divide $y_0$, p divides c, in contradiction to
the definition of $x_0$.

Case 2. If p divides $y_0$, then similarly p divides $ax_0^2$. Since p does not divide
$x_0$, p divides a, in contradiction to the definition of $y_0$.

Case 3. If p divides $w_0$, then the fact that p divides a and c implies that p
divides $bx_0y_0$. Since p divides neither $x_0$ nor $y_0$, p divides b, in contradiction to
the fact that (a, b, c) is primitive.
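The proof is constructive, and the pair $(x_0, y_0)$ can be computed directly from the prime divisors of m. The Python fragment below is a sketch under stated assumptions: the names `prime_divisors`, `coprime_representation`, and `check` are ours, and the prime divisors are found by trial division, which is adequate only for small m.

```python
from math import gcd


def prime_divisors(n):
    """Return the distinct prime divisors of the nonzero integer n."""
    n, primes, p = abs(n), [], 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def coprime_representation(a, b, c, m):
    """Return (x0, y0, value) as in the proof of Lemma 1.10 for the
    primitive form (a, b, c) and the nonzero integer m."""
    x0 = y0 = 1
    for p in prime_divisors(m):
        if a % p == 0 and c % p != 0:
            x0 *= p                       # p divides a and m but not c
        elif a % p != 0:
            y0 *= p                       # p divides m but not a
        # primes dividing a, c, and m make up w0 and enter neither product
    return x0, y0, a * x0 * x0 + b * x0 * y0 + c * y0 * y0


def check(a, b, c, m):
    """Confirm that the representation is primitive and prime to m."""
    x0, y0, v = coprime_representation(a, b, c, m)
    return gcd(x0, y0) == 1 and gcd(v, m) == 1


print(coprime_representation(2, 2, 3, 30), check(2, 2, 3, 30))
```

For the form (2, 2, 3) and m = 30 the construction gives $x_0 = 2$ and $y_0 = 15$, and the represented value is prime to 30.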


⁴ The argument being used here, that a congruence modulo 2a implies the congruence of the
squares modulo 4a, will be used again later in this section without detailed comment.

Lemma 1.11. Suppose that $(a_1, b, c_1)$ and $(a_2, b, c_2)$ are properly equivalent
forms of nonsquare discriminant. If l is an integer such that $\mathrm{GCD}(a_1, a_2, l) = 1$
and such that l divides $\mathrm{GCD}(c_1, c_2)$, then $(la_1, b, l^{-1}c_1)$ and $(la_2, b, l^{-1}c_2)$ are
properly equivalent forms.


REMARK. Even if $(a_1, b, c_1)$ and $(a_2, b, c_2)$ are primitive, it does not follow
that $(la_1, b, l^{-1}c_1)$ and $(la_2, b, l^{-1}c_2)$ are primitive. In fact, one need only take
l = 2 and $(a_1, b, c_1) = (a_2, b, c_2) = (1, 2, 4)$.


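PROOF. Since $(a_1, b, c_1)$ and $(a_2, b, c_2)$ are properly equivalent, there exists
$\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ in SL(2, Z) with

$$\begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix} \begin{pmatrix} 2a_1 & b \\ b & 2c_1 \end{pmatrix} \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} = \begin{pmatrix} 2a_2 & b \\ b & 2c_2 \end{pmatrix}.$$

We multiply both sides on the right by $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}^{-1} = \begin{pmatrix} \delta & -\beta \\ -\gamma & \alpha \end{pmatrix}$, and the result is the system of
four scalar equations

$$\begin{aligned}
2a_1\alpha + b\gamma &= 2a_2\delta - b\gamma, \\
2a_1\beta + b\delta &= b\delta - 2c_2\gamma, \\
b\alpha + 2c_1\gamma &= -2a_2\beta + b\alpha, \\
b\beta + 2c_1\delta &= -b\beta + 2c_2\alpha.
\end{aligned}$$

The second and third equations simplify to $a_1\beta + c_2\gamma = 0$ and $a_2\beta + c_1\gamma = 0$.
Since l divides $c_1$ and $c_2$, these two simplified equations show that l divides $a_1\beta$
and $a_2\beta$. Since $\mathrm{GCD}(a_1, a_2, l) = 1$, it follows that l divides $\beta$.
Therefore the matrix $\begin{pmatrix} \alpha & l^{-1}\beta \\ l\gamma & \delta \end{pmatrix}$ of determinant 1 has integer entries. Direct
computation shows that

$$\begin{pmatrix} \alpha & l\gamma \\ l^{-1}\beta & \delta \end{pmatrix} \begin{pmatrix} 2la_1 & b \\ b & 2l^{-1}c_1 \end{pmatrix} \begin{pmatrix} \alpha & l^{-1}\beta \\ l\gamma & \delta \end{pmatrix} = \begin{pmatrix} 2la_2 & b \\ b & 2l^{-1}c_2 \end{pmatrix}.$$

Consequently the forms $(la_1, b, l^{-1}c_1)$ and $(la_2, b, l^{-1}c_2)$ are properly equivalent.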


Theorem 1.12. Let D be a nonsquare discriminant, and let $C_1$ and $C_2$ be proper
equivalence classes of primitive forms of discriminant D.

(a) There exist aligned forms $(a_1, b, c_1) \in C_1$ and $(a_2, b, c_2) \in C_2$, and these
may be chosen in such a way that $a_1$ and $a_2$ are relatively prime to each other and
to any integer $m \neq 0$ given in advance.

(b) If the product of $C_1$ and $C_2$ is defined to be the proper equivalence class
of the composition of any aligned representatives of $C_1$ and $C_2$, as for example
the ones in (a), then the resulting product operation is well defined on proper
equivalence classes of primitive forms of discriminant D.

(c) Under the product operation in (b), the set of proper equivalence classes
of primitive forms of discriminant D is a finite abelian group. The identity is the
class of $(1, 0, -D/4)$ if $D \equiv 0 \bmod 4$ and is the class of $(1, 1, -(D-1)/4)$ if
$D \equiv 1 \bmod 4$. The group inverse of the class of (a, b, c) is the class of (a, -b, c).


REMARK. When D < 0, the proper equivalence classes of positive definite
forms are a subgroup. In fact, if $(a_1, b, c_1)$ and $(a_2, b, c_2)$ are positive definite
and are aligned, then $a_1$ and $a_2$ are positive, and therefore their composition
$(a_1a_2, b, j)$ has $a_1a_2$ positive and is positive definite. As was indicated in the
discussion before Lemma 1.10, the form class group for discriminant D is defined
to be the group in (c) if D > 0, and it is defined to be the subgroup of classes of
positive definite forms if D < 0.


PROOF OF THEOREM 1.12a. By two applications of Lemma 1.10, $C_1$ primitively
represents some integer $a_1$ prime to m, and $C_2$ primitively represents some integer
$a_2$ prime to $a_1m$. Arguing as in the last part of the proof of Theorem 1.6b, we may
assume without loss of generality that (x, y) = (1, 0) yields these values in each
case. Then $C_1$ contains a form $(a_1, b_1, *)$ for some $b_1$, and $C_2$ contains a form
$(a_2, b_2, *)$ for some $b_2$. By the remarks before Lemma 1.10, $C_1$ contains every
translate $(a_1, b_1 + 2a_1l_1, *)$, and $C_2$ contains every translate $(a_2, b_2 + 2a_2l_2, *)$.

Let us make specific choices of $l_1$ and $l_2$. We know that $b_1 \equiv D \equiv b_2 \bmod 2$,
so that $b_2 - b_1$ is even. The construction of $a_1$ and $a_2$ was arranged to make
$\mathrm{GCD}(a_1, a_2) = 1$, and therefore $\mathrm{GCD}(2a_1, 2a_2) = 2$. Since $b_2 - b_1$ is even,
we can choose $l_1$ and $l_2$ such that $2a_1l_1 - 2a_2l_2 = b_2 - b_1$. Then $b_1 + 2a_1l_1 =
b_2 + 2a_2l_2$, and we take the common value as b.

For this b, $C_1$ contains the form $(a_1, b, *)$, and $C_2$ contains the form $(a_2, b, *)$.
Since we have arranged that $\mathrm{GCD}(a_1, a_2) = 1$, the remark immediately following
the definition of "aligned" shows that these forms are aligned.
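The choice of a common middle coefficient in this proof can be carried out mechanically. The following is a minimal Python sketch, not part of the text: the function name `align` is hypothetical, and it assumes the two input forms have the same discriminant and relatively prime first coefficients, as arranged above.

```python
from math import gcd

def align(f1, f2):
    """Return translates of f1 = (a1, b1, c1) and f2 = (a2, b2, c2)
    that share a common middle coefficient b.

    Assumes (as in the proof) equal discriminants and gcd(a1, a2) == 1.
    """
    a1, b1, c1 = f1
    a2, b2, c2 = f2
    D = b1 * b1 - 4 * a1 * c1
    assert D == b2 * b2 - 4 * a2 * c2, "discriminants must agree"
    assert gcd(a1, a2) == 1, "first coefficients must be relatively prime"
    # Solve 2*a1*l1 - 2*a2*l2 = b2 - b1, i.e. a1*l1 = (b2 - b1)/2 mod |a2|.
    half = (b2 - b1) // 2  # exact, since b1 and b2 have the parity of D
    l1 = (half * pow(a1, -1, abs(a2))) % abs(a2)
    b = b1 + 2 * a1 * l1
    # The third coefficients are forced by the discriminant.
    return (a1, b, (b * b - D) // (4 * a1)), (a2, b, (b * b - D) // (4 * a2))

# Discriminant -56: (3, 2, 5) and its properly equivalent form (5, -2, 3).
print(align((3, 2, 5), (5, -2, 3)))
```

For the sample input the two translates are (3, 8, 10) and (5, 8, 6), and the common quotient 10/5 = 6/3 = 2 is an integer, which is the alignment condition.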


PROOF OF THEOREM 1.12b. Suppose that

\[
\begin{aligned}
(a_1', b', *) &\text{ is properly equivalent to } (a_1'', b'', *), \\
(a_2', b', *) &\text{ is properly equivalent to } (a_2'', b'', *),
\end{aligned}
\]

with the vertical pairs aligned. We are to show that

\[
(a_1'a_2', b', *) \text{ is properly equivalent to } (a_1''a_2'', b'', *). \tag{$*$}
\]

Theorem 1.12a applied to the integer $m = a_1'a_2'a_1''a_2''$ gives us an aligned pair of
forms $(a_1, b, *)$ and $(a_2, b, *)$ in the respective proper equivalence classes such
that $\mathrm{GCD}(a_1, a_2) = 1$ and $\mathrm{GCD}(a_1a_2, m) = 1$. If we can show that

\[
(a_1'a_2', b', *) \text{ is properly equivalent to } (a_1a_2, b, *), \tag{$**$}
\]

then we will have symmetrically that

\[
(a_1''a_2'', b'', *) \text{ is properly equivalent to } (a_1a_2, b, *),
\]

and $(*)$ will follow from this fact and $(**)$ by transitivity of proper equivalence.

We can now argue as in the proof of Theorem 1.12a. We know that $b \equiv D \equiv
b' \bmod 2$, so that $b' - b$ is even. The construction of $a_1$ and $a_2$ was arranged
to make $\mathrm{GCD}(a_1a_2, a_1'a_2') = 1$, and therefore $\mathrm{GCD}(2a_1a_2, 2a_1'a_2') = 2$. Since
$b' - b$ is even, we can choose $l$ and $l'$ such that $2a_1a_2l - 2a_1'a_2'l' = b' - b$. Then
$b + 2a_1a_2l = b' + 2a_1'a_2'l'$, and we take the common value as B. This B has

\[
B \equiv b \bmod 2a_1a_2 \quad\text{and}\quad B \equiv b' \bmod 2a_1'a_2'.
\]

Thus

\[
\begin{aligned}
(a_1, b, *) &\text{ is properly equivalent to } (a_1, B, *), \\
(a_2, b, *) &\text{ is properly equivalent to } (a_2, B, *), \\
(a_1a_2, b, *) &\text{ is properly equivalent to } (a_1a_2, B, *),
\end{aligned} \tag{$\dagger$}
\]

and similarly

\[
\begin{aligned}
(a_1', b', *) &\text{ is properly equivalent to } (a_1', B, *), \\
(a_2', b', *) &\text{ is properly equivalent to } (a_2', B, *), \\
(a_1'a_2', b', *) &\text{ is properly equivalent to } (a_1'a_2', B, *).
\end{aligned} \tag{$\dagger\dagger$}
\]

By construction of b, $(a_1, b, *)$ is properly equivalent to $(a_1', b', *)$. This equivalence,
in combination with the first line of $(\dagger)$ and the first line of $(\dagger\dagger)$, shows
that

\[
(a_1, B, *) \text{ is properly equivalent to } (a_1', B, *). \tag{$\ddagger$}
\]

Let us check that Lemma 1.11 is applicable to the two properly equivalent
forms of $(\ddagger)$ and to the integer $l = a_2'$. In fact, $\mathrm{GCD}(a_1, a_1', l) = 1$ follows
from $\mathrm{GCD}(a_1a_2, a_1'a_2') = 1$, and the problem is to show that $l = a_2'$ divides
$(D - B^2)/(4a_1)$ and $(D - B^2)/(4a_1')$. To see this divisibility, we observe that
$D - b'^2$ is divisible by $4a_1'a_2'$ because $(a_1', b', *)$ and $(a_2', b', *)$ are given as
aligned; the congruence $b' \equiv B \bmod 2a_1'a_2'$ implies that $b'^2 \equiv B^2 \bmod 4a_1'a_2'$,
and addition gives $D - B^2 \equiv 0 \bmod 4a_1'a_2'$. Meanwhile, $D - B^2$ is divisible
by $4a_1$ because the third member of $(a_1, B, *)$ is an integer. Since $D - B^2$ is
divisible also by $4a_1'a_2'$, and since $\mathrm{GCD}(a_1, a_1'a_2') = 1$, $D - B^2$ is divisible by
$4a_1a_1'a_2'$. Therefore $(D - B^2)/(4a_1)$ and $(D - B^2)/(4a_1')$ are divisible by $a_2'$, and
Lemma 1.11 is indeed applicable.

The application of Lemma 1.11 to $(\ddagger)$ with $l = a_2'$ shows that

\[
(a_1a_2', B, *) \text{ is properly equivalent to } (a_1'a_2', B, *).
\]

Similarly $(a_2, B, *)$ is properly equivalent to $(a_2', B, *)$, and an application of
Lemma 1.11 to this equivalence with $l = a_1$ shows that

\[
(a_1a_2, B, *) \text{ is properly equivalent to } (a_1a_2', B, *).
\]

The two results together show that

\[
(a_1a_2, B, *) \text{ is properly equivalent to } (a_1'a_2', B, *).
\]

Combining this equivalence with the third line of $(\dagger)$ and the third line of $(\dagger\dagger)$,
we obtain $(**)$, and the proof of (b) is complete.


PROOF OF THEOREM 1.12c. The set of proper equivalence classes is finite by
Theorem 1.6a, and commutativity of multiplication is clear. Define $\delta$ to be 0 if
$D \equiv 0 \bmod 4$ and to be 1 if $D \equiv 1 \bmod 4$. Let us see that the class of $(1, \delta, *)$
is the identity. If (a, b, c) has discriminant D, then $b \equiv \delta \bmod 2$, and hence
$(1, b, *) = (1, \delta + 2 \cdot 1 \cdot \tfrac12(b - \delta), *)$ is a translate of $(1, \delta, *)$. Consequently
$(1, b, *)$ and $(1, \delta, *)$ are properly equivalent. Since Proposition 1.9 shows that
the composition of (a, b, c) and $(1, b, *)$ is $(a, b, *)$, Theorem 1.12b allows us to
conclude that the class of $(1, \delta, *)$ is the identity.

For inverses Theorem 1.12b shows that the product of the classes of (a, b, c)
and (a, -b, c) is the product of the classes of (a, b, c) and (c, b, a), which is
the class of the composition (a, b, c)(c, b, a). Proposition 1.9 shows that this
composition is (ac, b, 1). Since (ac, b, 1) is properly equivalent to (1, -b, ac)
and since the latter is properly equivalent to $(1, \delta, *)$, the class of the composition
(a, b, c)(c, b, a) is the identity.

To complete the proof, we need to verify associativity. Let $C_1$, $C_2$, and $C_3$
be three proper equivalence classes of primitive forms of discriminant D. Let
$(a_1, b_1, c_1)$ be a form in the class $C_1$. Lemma 1.10 shows that $C_2$ represents an
integer $a_2$ prime to $a_1$, and then it follows that the form $(a_2, b_2, c_2)$ is in $C_2$ for some
integers $b_2$ and $c_2$. A second application of Lemma 1.10 shows that $C_3$ represents
an integer $a_3$ prime to $a_1a_2$, and then it follows that the form $(a_3, b_3, c_3)$ is in $C_3$ for
some integers $b_3$ and $c_3$. The middle components have $b_1 \equiv b_2 \equiv b_3 \equiv \delta \bmod 2$,
and thus $\tfrac12(b_j - \delta)$ is an integer for j = 1, 2, 3. Since $a_1, a_2, a_3$ are relatively
prime in pairs, the Chinese Remainder Theorem shows that the congruences
$x \equiv \tfrac12(b_j - \delta) \bmod a_j$ have a common integer solution x for j = 1, 2, 3. Define
$b = 2x + \delta$. Then b is a solution of $b \equiv b_j \bmod 2a_j$ for j = 1, 2, 3. Write
$b = b_j + 2a_jn_j$ for suitable integers $n_j$. Then $(a_j, b, *) = (a_j, b_j + 2a_jn_j, *)$
is a translate of $(a_j, b_j, c_j)$ and consequently is properly equivalent to it. Thus
$(a_j, b, *)$ lies in $C_j$. Taking into account Theorem 1.12b and using Proposition 1.9,
we see that $C_1(C_2C_3)$ and $(C_1C_2)C_3$ are both represented by the form $(a_1a_2a_3, b, *)$
and hence are equal.


5. Genera 


The theory of genera lumps proper equivalence classes of forms of a given dis- 
criminant according to their values in some way. There are at least two possible 
definitions of “genus,” and it is a deep result that they lead to the same thing 
in all cases of interest. By way of background, we saw in Sections 2 and 3 for
discriminant $D = -56$ that the number of proper equivalence classes of binary
quadratic forms is exactly 4, representatives being $x^2 + 14y^2$, $2x^2 + 7y^2$, and
$3x^2 \pm 2xy + 5y^2$. The last two are improperly equivalent and take the same values
at integer points (x, y), and there are no other improper equivalences. Thus the
first two take on a disjoint set of prime values from the values of $3x^2 + 2xy + 5y^2$
for integer points (x, y), and the sets of prime values taken on by $x^2 + 14y^2$ and
$2x^2 + 7y^2$ at integer points are disjoint from one another.

Two possible lumpings of proper equivalence classes arise for this discriminant. 
One is to identify forms when their values modulo 56 include the same residues 
prime to 56. It is just a finite computation to see that 


\[
\begin{array}{ll}
x^2 + 14y^2 \text{ and } 2x^2 + 7y^2 & \text{take on the residues } 1, 9, 15, 23, 25, 39, \\
3x^2 \pm 2xy + 5y^2 & \text{take on the residues } 3, 5, 13, 19, 27, 45.
\end{array}
\]

Thus the first kind of lumping treats $x^2 + 14y^2$ and $2x^2 + 7y^2$ together because
of the residues they take on, and it treats $3x^2 + 2xy + 5y^2$ and $3x^2 - 2xy + 5y^2$
together. Gauss proceeded by using this kind of lumping to define "genus."
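The finite computation of residues modulo 56 is easy to reproduce. Here is a small Python sketch; the helper name `unit_residues` is an illustrative choice and not notation from the text. It relies only on the fact that the value of a form modulo 56 depends on x and y modulo 56.

```python
from math import gcd

def unit_residues(form, modulus):
    """Residues prime to `modulus` taken on by a*x^2 + b*x*y + c*y^2."""
    a, b, c = form
    values = {
        (a * x * x + b * x * y + c * y * y) % modulus
        for x in range(modulus)
        for y in range(modulus)
    }
    return sorted(v for v in values if gcd(v, modulus) == 1)

for form in [(1, 0, 14), (2, 0, 7), (3, 2, 5), (3, -2, 5)]:
    print(form, unit_residues(form, 56))
```

The first two forms yield one list of six residues and the last two yield the complementary list of six, in agreement with the display above.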

The other lumping is to identify integer forms that take on the same rational
values at rational points. Here $2x^2 + 7y^2 = 1$ for $(x, y) = (\tfrac13, \tfrac13)$, and of course
$x^2 + 14y^2 = 1$ for (x, y) = (1, 0). Hence the sets of values of $x^2 + 14y^2$
and $2x^2 + 7y^2$ for x and y rational have a nonzero value in common. Lemma
1.13 below implies that the sets of rational values taken on by the two forms are
identical. The second kind of lumping treats $x^2 + 14y^2$ and $2x^2 + 7y^2$ together
because they take on the same rational values. We shall use this latter kind of
lumping because, as Theorem 1.14 below shows, this is the definition that more
quickly identifies the genus group once the form class group is known.
Problems 25-40 at the end of the chapter show that the two definitions of genus
lead to the same thing for discriminants that are "fundamental" in a sense that we
define in a moment.

We have defined two forms (a, b, c) and (a', b', c') with integer entries to be
"properly equivalent" if there is a matrix $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ in SL(2, Z) with

\[
\begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix}
\begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix}
\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}
=
\begin{pmatrix} 2a' & b' \\ b' & 2c' \end{pmatrix}.
\]

We say that two forms (a, b, c) and (a', b', c') with rational entries are properly
equivalent over Q if there is a matrix $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ in SL(2, Q) such that the displayed
equality holds. For emphasis we can refer to the original notion as "proper
equivalence over Z" when it is advisable to be more specific. It is evident that if
two forms with rational entries are properly equivalent over Q, then their sets of
values at points (x, y) in Q x Q are the same.


Lemma 1.13. If (a, b, c) is a form with rational coefficients and with nonsquare
discriminant D that takes on a nonzero value $q \in \mathbb{Q}$ for some $(x_0, y_0)$
in Q x Q, then (a, b, c) is properly equivalent over Q to $(q, 0, -D/(4q))$.
Consequently two forms over Q of the same discriminant that take on a nonzero
value in common over Q are properly equivalent over Q.


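PROOF. Suppose that $ax_0^2 + bx_0y_0 + cy_0^2 = q$. Put $\begin{pmatrix} \alpha \\ \gamma \end{pmatrix} = \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}$. Since $x_0$
and $y_0$ cannot both be 0, we can choose rationals $\beta$ and $\delta$ such that $\alpha\delta - \beta\gamma = 1$.
Then $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ has determinant 1 and satisfies $\begin{pmatrix} x_0 \\ y_0 \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} 1 \\ 0 \end{pmatrix}$. The equality

\[
ax_0^2 + bx_0y_0 + cy_0^2 = \tfrac12 \begin{pmatrix} x_0 & y_0 \end{pmatrix} \begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix} \begin{pmatrix} x_0 \\ y_0 \end{pmatrix}
\]

therefore yields

\[
q = \tfrac12 \begin{pmatrix} 1 & 0 \end{pmatrix}
\begin{pmatrix} \alpha & \gamma \\ \beta & \delta \end{pmatrix}
\begin{pmatrix} 2a & b \\ b & 2c \end{pmatrix}
\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}
\begin{pmatrix} 1 \\ 0 \end{pmatrix}.
\]

It follows that (a, b, c) is properly equivalent over Q to some form (q, b', c')
with b' and c' rational. Using a translation with a rational parameter, we see that
(q, b', c') is properly equivalent over Q to a form $(q, 0, *)$. Inspection of the
discriminant shows that this last form must be $(q, 0, -D/(4q))$.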
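The two steps of this proof, a unimodular rational substitution followed by a rational translation, can be traced with exact rational arithmetic. The sketch below is illustrative only: the names `transform` and `normalize` are hypothetical, and the particular choice of $\beta$ and $\delta$ is one convenient solution of $\alpha\delta - \beta\gamma = 1$ among many.

```python
from fractions import Fraction as F

def transform(form, alpha, beta, gamma, delta):
    """Apply the substitution x -> alpha*x + beta*y, y -> gamma*x + delta*y."""
    a, b, c = form
    a2 = a * alpha * alpha + b * alpha * gamma + c * gamma * gamma
    b2 = 2 * a * alpha * beta + b * (alpha * delta + beta * gamma) + 2 * c * gamma * delta
    c2 = a * beta * beta + b * beta * delta + c * delta * delta
    return (a2, b2, c2)

def normalize(form, x0, y0):
    """Follow the proof of Lemma 1.13 for a rational point (x0, y0)
    at which the form takes a nonzero value q."""
    alpha, gamma = F(x0), F(y0)
    # One convenient choice of beta, delta with alpha*delta - beta*gamma = 1.
    if alpha != 0:
        beta, delta = F(0), 1 / alpha
    else:
        beta, delta = -1 / gamma, F(0)
    q, b1, c1 = transform(form, alpha, beta, gamma, delta)
    t = -b1 / (2 * q)  # the translation x -> x + t*y removes the middle term
    return transform((q, b1, c1), F(1), t, F(0), F(1))

print(normalize((2, 0, 7), F(1, 3), F(1, 3)))
print(normalize((1, 0, 14), 1, 0))
```

Both calls end at the form (1, 0, 14), which is $(q, 0, -D/(4q))$ for q = 1 and D = -56.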


Two primitive integer forms having the same discriminant are said to be in
the same genus (plural: genera) if they are properly equivalent over Q. In view
of Lemma 1.13 the condition is that they are primitive and take on a common
nonzero value over Q, or equivalently that they are primitive and take on the same
set of values over Q. Thus $x^2 + 14y^2$ and $2x^2 + 7y^2$ furnish an example of two
forms in distinct classes that are in the same genus. Two primitive integer forms
that are in the same proper equivalence class over Z are in the same genus. The
genus of the class C will be denoted by [C]. The identity class will be denoted
by $\mathcal{E}$, and $P = [\mathcal{E}]$ is called the principal genus. If (a, b, c) is an integer form
representing a class C, then Theorem 1.12c shows that (a, -b, c) represents $C^{-1}$.
On the other hand, C and $C^{-1}$ take on the same values over Z, as we see by
replacing (x, y) by (x, -y), and it follows that $[C] = [C^{-1}]$.

For the main theorem about genera, we shall introduce an extra hypothesis on 
the discriminant D. A nonsquare integer D will be said to be a fundamental 
discriminant if D is not divisible by the square of any odd prime and if when 
D is even, D/4 is congruent to 2 or 3 modulo 4. It will be seen later that this 
condition is equivalent to the requirement that D be the “field discriminant” of 
some quadratic number field. Examples of discriminants that are not fundamental
are D = -12, -44, -108.

With this condition imposed on D, any integer form (a, b, c) of discriminant D
is automatically primitive. In fact, no odd prime p can divide GCD(a, b, c), since
then $p^2$ would divide D. If 2 were to divide GCD(a, b, c), then (a/2, b/2, c/2)
would be an integer form, and $D/4 = (b/2)^2 - 4(a/2)(c/2)$ would be an integer
congruent to 0 or 1 modulo 4, contrary to the requirement that D/4 be congruent
to 2 or 3 modulo 4.
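The definition of fundamental discriminant translates directly into a test. The following Python sketch uses the hypothetical name `is_fundamental`; it also checks that D is congruent to 0 or 1 modulo 4, which the text takes for granted because D is the discriminant of a form.

```python
from math import isqrt

def is_fundamental(D):
    """Check the definition of a fundamental discriminant for an integer D."""
    if D % 4 not in (0, 1):
        return False  # not the discriminant of any integer form
    if D >= 0 and isqrt(D) ** 2 == D:
        return False  # squares are excluded
    n = abs(D)
    p = 3
    while p * p <= n:  # odd trial divisors; composite p are harmless here
        if n % (p * p) == 0:
            return False
        p += 2
    if D % 2 == 0:
        return (D // 4) % 4 in (2, 3)
    return True

for D in (-56, -4, -3, 5, 8, -12, -44, -108):
    print(D, is_fundamental(D))
```

The three values -12, -44, and -108 named in the text fail the test, the first two because D/4 is congruent to 1 modulo 4 and the third because 9 divides it.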


Theorem 1.14. For a fundamental discriminant D, the principal genus P of 
primitive integer forms⁸ is a subgroup of the form class group H, and the cosets
of P are the various genera. Thus the set G of genera is exactly the set of cosets 
H/P and inherits a group structure from class multiplication. The subgroup P 
coincides with the subgroup of squares in H, and consequently every nontrivial 
element of G has order 2. 


REMARKS. The group G is called the genus group of discriminant D. The 
hypothesis that D is fundamental is needed only for the conclusion that every 
member of P is a square in H. Since every nontrivial element of G has order 
2 when D is fundamental, application of the Fundamental Theorem of Finitely 
Generated Abelian Groups or use of vector-space theory over a 2-element field 
shows that G is the direct sum of cyclic groups of order 2; in particular, the order 
of G is a power of 2. Problems 25-29 at the end of the chapter show that the
order of G is $2^g$, where g + 1 is the number of distinct prime factors of D.


PROOF. Let V(C) denote the set of Q values assumed by forms in the class C
at points (x, y) in Q x Q. If S and S' are two genera and if C is a class in S and
C' is a class in S', we define $S \cdot S' = [CC']$.


⁸ As usual, we exclude the negative definite classes in the discussion.




To see that this product operation is well defined on the set G of genera, let $C''$
be in S' also. Then $V(C') = V(C'')$. If q is in V(C) and q' is in $V(C') = V(C'')$,
then the prescription for multiplying classes shows that qq' is in $V(CC')$ and
$V(CC'')$. Hence $V(CC') = V(CC'')$, and $[CC'] = [CC'']$. Therefore multiplication
of genera is well defined. Define a function $\varphi : H \to G$ by $\varphi(C) = [C]$. Then
the computation

\[
\varphi(CC') = [CC'] = [C][C'] = \varphi(C)\varphi(C')
\]

shows that $\varphi$ is a homomorphism of H onto G. The kernel of $\varphi$ is $[\mathcal{E}] = P$, which
is therefore a subgroup, and the image of $\varphi$, which is the set G of genera with its
product operation, has to be a group.

For any class C, the equality $[C] = [C^{-1}]$ implies that $[C^2] = [C][C] =
[C][C^{-1}] = [CC^{-1}] = [\mathcal{E}] = P$. Hence P contains all squares. Conversely let C
be in P. Then C takes on the value 1 over Q. If (a, b, c) is a form in the class C,
then there exist rationals r and s with $ar^2 + brs + cs^2 = 1$. Clearing fractions,
we see that there exist integers x and y such that $ax^2 + bxy + cy^2 = n^2$ for some
integer $n \neq 0$. Without loss of generality, we may assume that n is positive.
Since (a, b, c) is primitive, a familiar argument allows us to make a substitution
for which the value $n^2$ is taken on at (x, y) = (1, 0). In other words, (a, b, c)
is properly equivalent over Z to a form $(n^2, b', c')$ for suitable integers b' and
c'. The composition formula in Proposition 1.9 shows that the composition of
$(n, b', c'n)$ with itself is $(n^2, b', c')$, and hence C is exhibited as the square of the
class of $(n, b', c'n)$. Since $(n, b', c'n)$ has the same discriminant D as $(n^2, b', c')$
and therefore as (a, b, c) and since D is fundamental, $(n, b', c'n)$ is primitive.
Therefore C is the square of a class of primitive forms. If C is positive definite,
then the above choice of the sign of n as positive makes $(n, b', c'n)$ positive
definite. Hence the class of $(n, b', c'n)$ is in H.


EXAMPLE. The discriminant D = -56 is fundamental, and we have seen that
the form class group is of order 4 with representatives $x^2 + 14y^2$, $2x^2 + 7y^2$, and
$3x^2 \pm 2xy + 5y^2$. We have seen also that $x^2 + 14y^2$ and $2x^2 + 7y^2$ both lie in the
principal genus P. A group of order 4 must be isomorphic to the cyclic group
$C_4$ or to $C_2 \times C_2$. In the first case the subgroup of squares has order 2, and in the
second case the subgroup of squares has order 1. Since we have already found
two elements in P, P has order exactly 2. By the theorem we must be in the first
case. Hence H is of type $C_4$, and the genus group G is of type $C_2$. It is possible to
check directly that $3x^2 + 2xy + 5y^2$ has order 4 by making computations similar
to those for Problem 4d at the end of the chapter.
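The direct check that the class of $3x^2 + 2xy + 5y^2$ has order 4 can be sketched in Python. This is an illustration under stated assumptions rather than the computation the text has in mind: `reduce_form` is the standard reduction of a positive definite form to the representative with -a < b <= a <= c (and b >= 0 when a = c), and `compose_coprime` composes two forms whose first coefficients are relatively prime by aligning them as in the proof of Theorem 1.12a. Both function names are hypothetical.

```python
def reduce_form(f):
    """Reduced representative of the proper class of a positive definite form."""
    a, b, c = f
    while True:
        k = (a - b) // (2 * a)  # translate x -> x + k*y so that -a < b <= a
        b, c = b + 2 * a * k, a * k * k + b * k + c
        if a > c or (a == c and b < 0):
            a, b, c = c, -b, a  # the substitution (x, y) -> (-y, x)
        else:
            return (a, b, c)

def compose_coprime(f1, f2):
    """Composition of positive definite forms with gcd(a1, a2) == 1."""
    a1, b1, c1 = f1
    a2, b2, c2 = f2
    D = b1 * b1 - 4 * a1 * c1
    half = (b2 - b1) // 2
    l1 = (half * pow(a1, -1, a2)) % a2  # solves a1*l1 = (b2 - b1)/2 mod a2
    b = b1 + 2 * a1 * l1  # common middle coefficient
    return (a1 * a2, b, (b * b - D) // (4 * a1 * a2))

f = (3, 2, 5)
g = (5, -2, 3)  # f after (x, y) -> (-y, x), hence in the same proper class
square = reduce_form(compose_coprime(f, g))
fourth = reduce_form(compose_coprime(square, (7, 0, 2)))  # (7, 0, 2) ~ (2, 0, 7)
print(square, fourth)
```

The square reduces to (2, 0, 7), which is not the identity class, and the fourth power reduces to (1, 0, 14), so the class has order 4.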
6. Quadratic Number Fields and Their Units


In this section we review material about quadratic number fields that appears in 
various places in Basic Algebra, and we determine the units in the ring of integers 
of such a number field. 

Quadratic number fields are extension fields K of Q with [K : Q] = 2. Such a
field is necessarily of the form $K = \mathbb{Q}(\sqrt{m})$, where m is a uniquely determined
square-free integer not equal to 0 or 1. The set $\{1, \sqrt{m}\}$ is a vector-space basis
of K over Q.

The extension K/Q is a Galois extension, and the Galois group Gal(K/Q)
of automorphisms of K fixing Q has two elements. We denote the nontrivial
element of the Galois group by $\sigma$; its values on the members of the vector-space
basis are $\sigma(1) = 1$ and $\sigma(\sqrt{m}) = -\sqrt{m}$.

The norm $N = N_{K/\mathbb{Q}}$ and trace $\mathrm{Tr} = \mathrm{Tr}_{K/\mathbb{Q}}$ are given by $N(\alpha) = \alpha \cdot \sigma(\alpha)$ and
$\mathrm{Tr}(\alpha) = \alpha + \sigma(\alpha)$. Thus $N(a + b\sqrt{m}) = a^2 - mb^2$ and $\mathrm{Tr}(a + b\sqrt{m}) = 2a$.
These values are members of Q. The norm is multiplicative in the sense that
$N(\alpha\beta) = N(\alpha)N(\beta)$, and N(1) = 1.

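The ring R of algebraic integers in K is the integral closure of Z in K. It works
out to be

\[
R = \begin{cases}
\mathbb{Z}[\sqrt{m}] & \text{if } m \equiv 2 \text{ or } 3 \bmod 4, \\
\mathbb{Z}[\tfrac12(\sqrt{m} - 1)] & \text{if } m \equiv 1 \bmod 4,
\end{cases}
\]

and is therefore a free abelian group of rank 2. The automorphism $\sigma$ carries R to
itself. The norm and trace of any member of R are in Z; conversely any member
of K whose norm and trace are in Z is in R. We define the algebraic integer $\delta$ to
be given by

\[
\delta = \begin{cases}
-\sqrt{m} & \text{if } m \equiv 2 \text{ or } 3 \bmod 4, \\
\tfrac12(1 - \sqrt{m}) & \text{if } m \equiv 1 \bmod 4.
\end{cases}
\]

Then $\{1, \delta\}$ is a Z basis of R. The norm and trace of $\delta$ are given by

\[
N(\delta) = \delta \cdot \sigma(\delta) = \begin{cases}
-m & \text{if } m \equiv 2 \text{ or } 3 \bmod 4, \\
\tfrac14(1 - m) & \text{if } m \equiv 1 \bmod 4,
\end{cases}
\]

\[
\mathrm{Tr}(\delta) = \delta + \sigma(\delta) = \begin{cases}
0 & \text{if } m \equiv 2 \text{ or } 3 \bmod 4, \\
1 & \text{if } m \equiv 1 \bmod 4.
\end{cases}
\]

There is a general notion of field discriminant D, or absolute discriminant,
for an algebraic number field, whose definition will be given in Chapter V. We
shall not give that definition in general now but will be content to give the formula
for D in the quadratic number field $\mathbb{Q}(\sqrt{m})$, namely

\[
D = \begin{cases}
4m & \text{if } m \equiv 2 \text{ or } 3 \bmod 4, \\
m & \text{if } m \equiv 1 \bmod 4.
\end{cases}
\]

The units of K are understood to be the members of the group $R^\times$ of units in
the ring R. These are the members $\varepsilon$ of R with $N(\varepsilon) = \pm 1$. In fact, if $\varepsilon$ is a unit,
then the equality $\varepsilon\varepsilon^{-1} = 1$ implies that $1 = N(1) = N(\varepsilon\varepsilon^{-1}) = N(\varepsilon)N(\varepsilon^{-1})$
and shows that $N(\varepsilon)$ is a unit in Z. Thus $N(\varepsilon) = \pm 1$. Conversely if $N(\varepsilon) = \pm 1$,
then $\pm\varepsilon\sigma(\varepsilon) = 1$ shows that $\sigma(\varepsilon) = \pm\varepsilon^{-1}$; since $\sigma(\varepsilon)$ is in R, $\varepsilon$ is exhibited as
in $R^\times$ and is therefore a unit.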

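For m < 0, the units of $\mathbb{Q}(\sqrt{m})$ are easily determined. In fact, if $\varepsilon = a + b\delta$
with a and b in Z, then $N(\varepsilon) = (a + b\delta)(a + b\sigma(\delta)) = a^2 + ab\,\mathrm{Tr}(\delta) + b^2 N(\delta)$
with each term equal to an integer and with the end terms $\geq 0$. Sorting out the
possibilities, we see that

\[
R^\times = \begin{cases}
\{\pm 1, \pm\sqrt{-1}\} & \text{if } m = -1, \\
\{\pm 1, \tfrac12(\pm 1 \pm \sqrt{-3})\} & \text{if } m = -3, \\
\{\pm 1\} & \text{for all other } m < 0.
\end{cases}
\]

The respective orders of $R^\times$ are 4, 6, and 2.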
Determination of the units when m > 0 is more delicate. We require a lemma. 


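Lemma 1.15. If $\alpha$ is a real irrational number and if N > 0 is an integer, then
there exist integers A and B with

\[
|B\alpha - A| < \frac{1}{N} \quad\text{and}\quad 0 < B \leq N.
\]

For this A and this B,

\[
\Bigl|\alpha - \frac{A}{B}\Bigr| < \frac{1}{B^2}.
\]

PROOF. Put $\alpha_n = n\alpha - [n\alpha]$, where $[\,\cdot\,]$ denotes the greatest-integer function.
Then $0 \leq \alpha_n < 1$. We partition the half-open interval [0, 1) into N subintervals
$[\tfrac{t-1}{N}, \tfrac{t}{N})$, with $1 \leq t \leq N$. For $0 \leq n \leq N$, the expression $\alpha_n$ takes on N + 1
distinct values because $\alpha_n = \alpha_m$ would imply that $(n - m)\alpha$ is in Z. Hence
there exist $\alpha_n$ and $\alpha_m$ with n > m that lie in the same subinterval $[\tfrac{t-1}{N}, \tfrac{t}{N})$.
Then $|\alpha_n - \alpha_m| < \tfrac{1}{N}$. If we take B = n - m and $A = [n\alpha] - [m\alpha]$, then
$|B\alpha - A| = |\alpha_n - \alpha_m|$, and the inequality $|B\alpha - A| < \tfrac{1}{N}$ follows. Dividing this
inequality by B gives $|\alpha - \tfrac{A}{B}| < \tfrac{1}{BN}$, and this is $\leq \tfrac{1}{B^2}$ since $N \geq B$.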
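The pigeonhole argument in this proof is constructive, and for $\alpha = \sqrt{m}$ it can be run in exact integer arithmetic. The Python sketch below is a minimal illustration: the name `dirichlet_pair` is hypothetical, the restriction to square roots of positive nonsquare integers is an assumption made so that floors can be computed with `isqrt`, and the subinterval index is 0-based, so it is t - 1 in the notation of the proof.

```python
from math import isqrt, sqrt

def dirichlet_pair(m, N):
    """Integers (A, B) with |B*sqrt(m) - A| < 1/N and 0 < B <= N,
    for a positive nonsquare integer m, found by the pigeonhole argument."""
    seen = {}  # subinterval index -> first n whose alpha_n lies in it
    for n in range(N + 1):
        floor_n = isqrt(n * n * m)  # [n * sqrt(m)]
        # floor(N * alpha_n), where alpha_n = n*sqrt(m) - [n*sqrt(m)]
        t = isqrt(N * N * n * n * m) - N * floor_n
        if t in seen:
            k = seen[t]
            return floor_n - isqrt(k * k * m), n - k
        seen[t] = n
    raise AssertionError("unreachable: N + 1 values fall in N subintervals")

A, B = dirichlet_pair(2, 10)
print(A, B, abs(B * sqrt(2) - A))
```

For m = 2 and N = 10 the first collision occurs between n = 5 and n = 0, giving A = 7 and B = 5.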


Proposition 1.16. For $K = \mathbb{Q}(\sqrt{m})$ with m > 0, the units are the members
of the infinite group

\[
R^\times = \{(\pm 1)\varepsilon_1^n \mid n \in \mathbb{Z}\} \cong \mathbb{Z} \times C_2,
\]

where $\varepsilon_1$ is the fundamental unit, defined as the least unit > 1.

REMARK. For example, when m = 2, the fundamental unit is $\varepsilon_1 = 1 + \sqrt{2}$.


PROOF. The units $\omega$ with $|\omega| = 1$ are $\pm 1$, since the members of K are real
numbers. We shall show shortly that there exists a unit $\omega$ with $|\omega| \neq 1$. Then $\omega$ or
$\omega^{-1}$ has absolute value > 1. Let us say that $|\omega| > 1$. Then one of $\omega$ and $-\omega$ is > 1.
Let us say that $\omega > 1$. Write $\omega = a + b\sqrt{m}$, so that $\sigma(\omega) = a - b\sqrt{m} = \pm\omega^{-1}$
has $|\sigma(\omega)| < 1$. Then

\[
|2a| = |\omega + \sigma(\omega)| \leq |\omega| + |\sigma(\omega)| < |\omega| + 1
\]

and

\[
|2b\sqrt{m}| = |\omega - \sigma(\omega)| \leq |\omega| + |\sigma(\omega)| < |\omega| + 1
\]

together show that there are only finitely many units $\omega'$ with $1 < |\omega'| < |\omega|$.
Hence the existence of a unit $\omega$ with $|\omega| \neq 1$ implies the existence of a fundamental
unit $\varepsilon_1$.

If $\omega'$ is any unit > 1, then we can choose a power $\varepsilon_1^n$ of $\varepsilon_1$ with $\varepsilon_1^{n+1} > \omega' \geq \varepsilon_1^n$,
by the archimedean property of R. Then $\omega'\varepsilon_1^{-n}$ is a unit $\geq 1$ with $|\omega'\varepsilon_1^{-n}| < \varepsilon_1$.
Since $\varepsilon_1$ is fundamental, $\omega'\varepsilon_1^{-n}$ is 1, and thus $\omega' = \varepsilon_1^n$. Then it follows that the
group of units has the asserted form.

Thus we need to exhibit some unit $\omega$ with $|\omega| \neq 1$. We apply Lemma 1.15
with $\alpha = \sqrt{m}$ and with N arbitrary. Then we obtain infinitely many pairs (A, B)
of integers with $|\sqrt{m} - \tfrac{A}{B}| < \tfrac{1}{B^2} \leq 1$, hence with $|A/B| < 1 + \sqrt{m}$. For each
such pair (A, B), the member $r = A - B\sqrt{m}$ of R has

\[
|N(r)| = |(A + B\sqrt{m})(A - B\sqrt{m})| = \Bigl|\frac{A}{B} - \sqrt{m}\Bigr|\,|B^2|\,\Bigl|\frac{A}{B} + \sqrt{m}\Bigr|
< \frac{1}{B^2}\,B^2\,(1 + 2\sqrt{m}) = 1 + 2\sqrt{m}.
\]

Thus there are infinitely many r in R with $|N(r)| < 1 + 2\sqrt{m}$. Since the norm
of an algebraic integer is in Z, there is some integer n such that infinitely many
$r \in R$ have N(r) = n. Among the elements $r \in R$ with N(r) = n, which
we write as $r = A + B\sqrt{m}$ with A and B in $\tfrac12\mathbb{Z}$, we consider the finitely many
congruence classes of (A, B) modulo n, saying that two such (A, B) and (A', B')
are congruent if A - A' and B - B' are integers divisible by n. Since infinitely
many $r \in R$ have N(r) = n, there must be infinitely many of these in some
particular congruence class. Take three such, say $\alpha_1$, $\alpha_2$, and $\alpha_3$. Then

\[
N(\alpha_1) = N(\alpha_2) = N(\alpha_3) = n
\]

with

\[
\frac{\alpha_1 - \alpha_2}{n} \text{ in } R \quad\text{and}\quad \frac{\alpha_1 - \alpha_3}{n} \text{ in } R.
\]

Since $n = N(\alpha_2) = \alpha_2\sigma(\alpha_2)$, we see that

\[
\frac{\alpha_1}{\alpha_2} = 1 + \frac{\alpha_1 - \alpha_2}{\alpha_2} = 1 + \frac{\alpha_1 - \alpha_2}{n}\,\sigma(\alpha_2).
\]

Thus $\alpha_1/\alpha_2$ is exhibited as in R, and it has $N(\alpha_1/\alpha_2) = N(\alpha_1)/N(\alpha_2) = n/n =
1$. Hence $\alpha_1/\alpha_2$ is a unit different from +1. Arguing similarly with $\alpha_1/\alpha_3$, we
see that $\alpha_1/\alpha_3$ is a unit different from +1 and not equal to $\alpha_1/\alpha_2$. Hence one of
$\alpha_1/\alpha_2$ and $\alpha_1/\alpha_3$ is a unit whose absolute value is not 1.


7. Relationship of Quadratic Forms to Ideals 


We continue with K as the quadratic number field $\mathbb{Q}(\sqrt{m})$ and R as the ring of
algebraic integers in K. Here $R = \mathbb{Z}[\delta]$, where $\delta = -\sqrt{m}$ if $m \equiv 2$ or $3 \bmod 4$
and $\delta = \tfrac12(1 - \sqrt{m})$ if $m \equiv 1 \bmod 4$. Let D be the field discriminant of $\mathbb{Q}(\sqrt{m})$
as defined in Section 6.

The topic of this section is a relationship between nonzero ideals in R and 
binary quadratic forms with discriminant D. Binary quadratic forms with D as 
discriminant are automatically primitive. 

The relationship is not a one-one correspondence of ideals to forms but a one- 
one correspondence of a certain kind of equivalence class of ideals to proper 
equivalence classes of forms. We saw in Theorem 1.12 that the latter collection 
has the structure of a finite abelian group, and we shall see in this section that the 
former collection has the natural structure of a finite abelian group as well. The 
correspondence is a group isomorphism, according to Theorem 1.20 below. 

Consider nonzero ideals I in R. The first observation is that I is additively a
free abelian group of rank 2. In fact, R itself is additively a free abelian group of
rank 2, and the additive subgroup I has to be free abelian of rank $\leq 2$. If r is a
nonzero element in I, then $N(r) = r\sigma(r)$ is in I, and thus I contains a nonzero
integer. If n is a nonzero integer in I, then $n\sqrt{m}$ is in I, and thus I contains a noninteger.
Therefore I is a free abelian group of rank exactly 2, as asserted.

Certainly I can then be generated as an ideal by two elements, and our customary
notation has been to write $I = (r_1, r_2)$ in this case. However, without an
extra condition on them, the two ideal generators need not together be a Z basis
for I because they need not generate all of I additively. It will be helpful to have
separate notation when the generators are known to give a Z basis. Accordingly
we shall write $I = \langle r_1, r_2 \rangle$ when $r_1, r_2$ give a Z basis of I. In this case it will
be helpful also to regard the set $\{r_1, r_2\}$ as ordered with $r_1$ preceding $r_2$, and we
shall often do so.

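Now suppose that $I = \langle r_1, r_2 \rangle$ is a nonzero ideal, and consider the expression

\[
r_1\sigma(r_2) - \sigma(r_1)r_2 = \det \begin{pmatrix} r_1 & \sigma(r_1) \\ r_2 & \sigma(r_2) \end{pmatrix}.
\]

If I is written in terms of a second ordered Z basis as $I = \langle s_1, s_2 \rangle$, then the two
ordered bases are related by a matrix $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$ in GL(2, Z), the relationship being

\[
\begin{pmatrix} r_1 \\ r_2 \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} s_1 \\ s_2 \end{pmatrix}.
\]

Hence

\[
\begin{pmatrix} r_1 & \sigma(r_1) \\ r_2 & \sigma(r_2) \end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} \begin{pmatrix} s_1 & \sigma(s_1) \\ s_2 & \sigma(s_2) \end{pmatrix},
\]

and therefore

\[
\det \begin{pmatrix} r_1 & \sigma(r_1) \\ r_2 & \sigma(r_2) \end{pmatrix} = \pm \det \begin{pmatrix} s_1 & \sigma(s_1) \\ s_2 & \sigma(s_2) \end{pmatrix},
\]

where $\pm 1$ is the determinant of $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$. Consequently the expression

\[
N(I) = \frac{|r_1\sigma(r_2) - \sigma(r_1)r_2|}{|\sqrt{D}|},
\]

where D is the field discriminant of K, is independent of the choice of Z basis.
It is called the norm of the ideal I. The factor of $\sqrt{D}$ in the denominator is a
normalization factor that arranges for the norm of the ideal I = R to be 1; in fact,
we can write $R = \langle 1, \delta \rangle$ with $\delta$ as in the first paragraph of this section, and then

\[
N(R) = \frac{|\sigma(\delta) - \delta|}{|\sqrt{D}|} =
\begin{cases}
\dfrac{|\sqrt{m} + \sqrt{m}|}{|\sqrt{4m}|} & \text{if } m \equiv 2 \text{ or } 3 \bmod 4 \\[2ex]
\dfrac{|\tfrac12(1 + \sqrt{m}) - \tfrac12(1 - \sqrt{m})|}{|\sqrt{m}|} & \text{if } m \equiv 1 \bmod 4
\end{cases}
\;=\; 1.
\]

Since the norm of an element of R is given by $N(r) = r\sigma(r)$, it is immediate
from the definition that

\[
N(rI) = |N(r)|\,N(I) \quad\text{for } r \in R.
\]

Consequently the norm of the principal ideal (r) is given by

\[
N((r)) = |N(r)|\,N(R) = |N(r)| \cdot 1 = |N(r)| \quad\text{for } r \in R.
\]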


Still with $I = \langle r_1, r_2 \rangle$, let us observe that

\[
\sigma\bigl(r_1\sigma(r_2) - \sigma(r_1)r_2\bigr) = -\bigl(r_1\sigma(r_2) - \sigma(r_1)r_2\bigr).
\]

It follows that

\[
r_1\sigma(r_2) - \sigma(r_1)r_2 \text{ is }
\begin{cases}
\text{real} & \text{if } m > 0, \\
\text{imaginary} & \text{if } m < 0.
\end{cases}
\]

Since $r_1\sigma(r_2) - \sigma(r_1)r_2$ changes sign when $r_1$ and $r_2$ are interchanged, let us say
that the expression $I = \langle r_1, r_2 \rangle$ for I is positively oriented if $r_1\sigma(r_2) - \sigma(r_1)r_2$
is positive or positive imaginary,⁹ negatively oriented if $r_1\sigma(r_2) - \sigma(r_1)r_2$ is
negative or negative imaginary. If $I = \langle r_1, r_2 \rangle$, then exactly one of the expressions
$I = \langle r_1, r_2 \rangle$ and $I = \langle r_2, r_1 \rangle$ is positively oriented. The notion of orientation will
be critical to setting up the correspondence between classes of ideals and classes
of forms.

The set of nonzero ideals of R has a commutative associative multiplication
that was introduced in Basic Algebra: if I and J are nonzero ideals, then IJ is
defined to be the set of sums of products from the two ideals, the product IJ
again being an ideal. Later in this section we shall recall some properties of this
multiplication that were proved in Basic Algebra.

We define two equivalence relations on the set of nonzero ideals of R. We say
that I and J are equivalent if there exist nonzero r and s in R with (r)I = (s)J.
Here (r) and (s) are understood to be principal ideals. The ideals I and J are
strictly equivalent, or narrowly equivalent, if equivalence occurs and if r and
s can be chosen with $N(rs^{-1}) > 0$. Both relations are certainly reflexive and
symmetric. To see transitivity, let $(r_1)I_1 = (r_2)I_2$ and $(s_2)I_2 = (s_3)I_3$. Then
$(r_1s_2)I_1 = (r_2s_2)I_2 = (r_2s_3)I_3$, and $I_1$ is equivalent to $I_3$. If also $N(r_1r_2^{-1}) > 0$
and $N(s_2s_3^{-1}) > 0$, then the product $N((r_1s_2)(r_2s_3)^{-1})$ is positive, and $I_1$ is
strictly equivalent to $I_3$. In other words, "equivalent" and "strictly equivalent"
are equivalence relations.

The principal ideals form one full equivalence class under "equivalent." First
of all, (r) is equivalent to (s) because (s)(r) = (rs) = (r)(s). In the reverse
direction, if I and (1) are equivalent, let (r)I = (s). Then there exists $x \in I$ with
rx = s. Hence $sr^{-1}$ is in I, and $(sr^{-1}) \subseteq I$. In fact, equality holds: if y is in I,
then the equality ry = sz with z in R says that $y = (sr^{-1})z$, and y is in $(sr^{-1})$.
In other words, $I = (sr^{-1})$.

In a sense, therefore, equivalence of ideals measures the extent to which
nonprincipal ideals exist.

Multiplication is a class property of ideals relative to equivalence and to
strict equivalence. In fact, if (r)I = (r')I' and (s)J = (s')J', then (rs)IJ =
(r's')I'J', and the assertion follows.

The theorem will be that multiplication of strict equivalence classes of ideals 
of R makes the set of such classes into an abelian group that is isomorphic to the 
finite abelian form class group of discriminant D. This result is not as beautiful as 
one might hope, since the identity class of ideals under strict equivalence need not 
match the set of all principal ideals. However, we can quantify the discrepancy. 
The relevant result is as follows. 


⁹ If m < 0, we adopt the convention that $\sqrt{m}$ is positive imaginary.

Proposition 1.17. Equivalence and strict equivalence are the same for ideals
of R if and only if either
(a) m > 0 and the fundamental unit $\varepsilon_1$ has $N(\varepsilon_1) = -1$ or
(b) m < 0.
In the contrary case when m > 0 and the fundamental unit $\varepsilon_1$ has $N(\varepsilon_1) = +1$, a
nonzero principal ideal (r) is strictly equivalent to (1) if and only if N(r) > 0;
in particular, the principal ideal $(\sqrt{m})$ is not strictly equivalent to (1).


REMARKS. When $m > 0$, there are examples with $N(\varepsilon_1) = +1$ and examples 
with $N(\varepsilon_1) = -1$. Specifically when $m = 2$, $\varepsilon_1 = 1 + \sqrt{2}$, and this has 
$N(\varepsilon_1) = -1$. When $m > 0$ and $m$ has any odd prime divisor $p$ with $p \equiv 
3 \bmod 4$, then $N(\varepsilon_1) = +1$; in fact, otherwise $\varepsilon_1 = x + y\sqrt{m}$ would imply that 
$-1 = N(\varepsilon_1) = x^2 - my^2$ and therefore that $-1 \equiv x^2 \bmod p$, but this congruence 
has no solutions by Theorem 1.2a. 
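
The congruence fact quoted in the remark can be checked by brute force for small primes. The following Python sketch is only an illustration: the helper name `minus_one_is_square` and the two lists of sample primes are choices made here, not part of the text, and a finite check is of course no substitute for Theorem 1.2a.

```python
def minus_one_is_square(p):
    """Return True if x*x ≡ -1 (mod p) has a solution x (brute force)."""
    return any((x * x + 1) % p == 0 for x in range(p))

# Sample odd primes, split by residue class mod 4 (illustrative choice).
primes_3_mod_4 = [3, 7, 11, 19, 23, 31, 43, 47]
primes_1_mod_4 = [5, 13, 17, 29, 37, 41]

# Primes p ≡ 3 mod 4 for which -1 is NOT a square mod p.
no_solution = [p for p in primes_3_mod_4 if not minus_one_is_square(p)]
# Primes p ≡ 1 mod 4 for which -1 IS a square mod p.
has_solution = [p for p in primes_1_mod_4 if minus_one_is_square(p)]
```

For the sampled primes, the congruence is solvable exactly when $p \equiv 1 \bmod 4$, which is the dichotomy the remark relies on.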


PROOF. Suppose that $m > 0$ and $N(\varepsilon_1) = -1$. If $(r)I = (s)J$ with $N(rs^{-1}) < 
0$, then $(\varepsilon_1 r)I = (s)J$ with $N(\varepsilon_1 rs^{-1}) > 0$. Thus equivalence implies strict 
equivalence in this case. 

Suppose that $m < 0$. Then all norms of nonzero elements are $> 0$. Hence 
$N(rs^{-1}) > 0$ is an empty condition, and equivalence implies strict equivalence. 

Conversely suppose that $m > 0$ and $N(\varepsilon_1) = +1$. Proposition 1.16 shows that 
the most general unit is $\varepsilon = \pm\varepsilon_1^n$, and consequently $N(\varepsilon) = N(\pm 1)N(\varepsilon_1)^n = +1$ 
for every unit. The element $\sqrt{m}$ is in $R$, and $N(\sqrt{m}) = -m < 0$. We know 
that the principal ideals $(1)$ and $(\sqrt{m})$ are equivalent. Arguing by contradiction, 
suppose that they are strictly equivalent. Then $(r) = (r)(1) = (s)(\sqrt{m}) = 
(s\sqrt{m})$ for some $r$ and $s$ with $N(rs^{-1}) > 0$. Since the principal ideals generated 
by $r$ and $s\sqrt{m}$ are the same, these elements must be related by $r = \varepsilon s\sqrt{m}$ for some 
unit $\varepsilon$. Then $N(rs^{-1}) = N(\varepsilon\sqrt{m}) = N(\varepsilon)N(\sqrt{m}) = -m < 0$, contradiction. 
The proposition follows. 


Once we have introduced group structures on the set of equivalence classes of 
ideals and the set of strict equivalence classes of ideals, it follows that the map that 
carries a strict equivalence class to the equivalence class containing it is a group 
homomorphism onto. If either of the conditions (a) and (b) in Proposition 1.17 
is satisfied, then this homomorphism is one-one. Otherwise its kernel consists of 
the two strict equivalence classes of principal ideals —those whose generator has 
positive norm and those whose generator has negative norm. 

At this point we could establish that the set of strict equivalence classes of ideals 
is a finite abelian group. The finiteness of the set of strict equivalence classes 
could be established directly by a geometric argument we give in Chapter V, 
and the group structure could be derived from the group structure on the set of 
“fractional ideals” of K that were introduced in Problems 48-53 at the end of 
Chapter VIII of Basic Algebra. 

Although we could proceed with proofs along these lines, it is instructive to 
proceed in a different way. Rather than give a stand-alone proof of the finiteness 
of the number of strict equivalence classes of ideals, we prefer to derive this 
finiteness as part of the correspondence with proper equivalence classes of binary 
quadratic forms, since the number of such classes of binary quadratic forms has 
already been proved to be finite in Theorem 1.6a. The group structure then readily 
follows from this finiteness and the fact that R is a Dedekind domain. 

Let us pause for a moment, therefore, to use results we already know in order 
to show how the group structure on the set of strict equivalence classes follows 
once it is known that there are only finitely many such classes. We know from 
Theorems 8.54 and 8.55 of Basic Algebra that R is a Dedekind domain and that 
R has unique factorization for its nonzero ideals. In other words, in terms of the 
already-defined multiplication of ideals, each nonzero ideal $I$ in $R$ is of the form 
$I = \prod_{j=1}^{k} P_j^{n_j}$, where the $P_j$ are distinct nonzero prime ideals, the $n_j$ are positive 
integers, and $k$ is $\ge 0$; moreover, this product expansion is unique up to the order 
of the factors. 


Lemma 1.18. Let $\mathcal{H}$ be the set of strict equivalence classes of nonzero ideals 
in $R$, with its inherited commutative associative multiplication. If $\mathcal{H}$ is finite, 
then $\mathcal{H}$ is a group under this multiplication. 


REMARKS. The group $\mathcal{H}$ will be seen in Theorem 1.20 to be isomorphic to the 
form class group of $D$. The set of ordinary equivalence classes is a quotient and 
is called the ideal class group of $K$. It will be generalized in Chapter V. 


PROOF. The identity element of $\mathcal{H}$ is the strict equivalence class of the ideal 
$R = (1)$, and we are to prove the existence of inverses. Thus let $I$ be given. For 
the sequence of ideals $I, I^2, I^3, \ldots$, the finiteness of $\mathcal{H}$ shows that two of these 
ideals must be strictly equivalent. Suppose that $I^k$ is equivalent to $I^{k+l}$ for some 
$k > 0$ and $l > 0$. Then there exist nonzero principal ideals $(r)$ and $(s)$ such that 
$(r)I^k = (s)I^{k+l}$. The uniqueness of factorization of ideals implies that we can 
cancel $I^k$ from both sides of this equality, thereby obtaining $(r) = (s)I^l$. Let us 
define an element $t$ in $R$. If $N(rs^{-1}) > 0$, we take $t$ to be 1. Otherwise $m$ must 
be positive, and we let $t = \sqrt{m}$, so that $N(t) < 0$. In both cases we then have 
$(rt)(1) = (s)(t)I^l$ with $N(rts^{-1}) > 0$, and the ideal $(t)I^l$ is strictly equivalent 
to $(1)$. Hence the strict equivalence class of $(t)I^{l-1}$ is an inverse to the strict 
equivalence class of $I$, and $\mathcal{H}$ is a group. 


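Now we define the mappings $\mathcal{F}$ and $\mathcal{I}$ that we shall use to establish the main 
result of this section. Let $I$ be a nonzero ideal in $R$, and suppose that $I$ is given 
by an expression $I = \langle r_1, r_2\rangle$ that is positively oriented. We regard $x$ and $y$ as 
integer variables. To $I$, we associate the binary quadratic form 

$$\mathcal{F}(I, r_1, r_2) = N(I)^{-1} N(r_1x + r_2y) = N(I)^{-1}(r_1x + r_2y)(\sigma(r_1)x + \sigma(r_2)y).$$

The associated 2-by-2 matrix for this form is 

$$\frac{1}{N(I)}\begin{pmatrix} 2r_1\sigma(r_1) & r_1\sigma(r_2) + r_2\sigma(r_1) \\ r_1\sigma(r_2) + r_2\sigma(r_1) & 2r_2\sigma(r_2)\end{pmatrix}
= \frac{1}{N(I)}\begin{pmatrix} r_1 & \sigma(r_1) \\ r_2 & \sigma(r_2)\end{pmatrix}\begin{pmatrix} \sigma(r_1) & \sigma(r_2) \\ r_1 & r_2\end{pmatrix},$$

and the discriminant of the quadratic form is therefore 

$$-\frac{1}{N(I)^2}\det\begin{pmatrix} r_1 & \sigma(r_1) \\ r_2 & \sigma(r_2)\end{pmatrix}\det\begin{pmatrix} \sigma(r_1) & \sigma(r_2) \\ r_1 & r_2\end{pmatrix}
= \frac{1}{N(I)^2}\bigl(r_1\sigma(r_2) - \sigma(r_1)r_2\bigr)^2$$

$$= |D|\,\frac{\bigl(r_1\sigma(r_2) - \sigma(r_1)r_2\bigr)^2}{\bigl|r_1\sigma(r_2) - \sigma(r_1)r_2\bigr|^2} = |D|(\operatorname{sgn} m) = D.$$

Thus we have associated a quadratic form $\mathcal{F}(I, r_1, r_2)$ of discriminant $D$ to an 
ideal $I$ when $I$ is given by a positively oriented expression $I = \langle r_1, r_2\rangle$. If 
$m < 0$, this quadratic form is positive definite because the coefficient of $x^2$, 
namely $N(I)^{-1}r_1\sigma(r_1) = N(I)^{-1}N(r_1)$, is positive when $m < 0$. 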

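In the reverse direction we associate to an arbitrary form $(a, b, c)$ of 
discriminant $D$ an ideal $I = \mathcal{I}(a, b, c)$ given by a positively oriented expression 
$\langle r_1, r_2\rangle$. To begin with, if $b$ is an integer with $b \equiv D \bmod 2$, let us define $b'$ 
to be $\tfrac12 b$ if $D \equiv 0 \bmod 4$ and to be $\tfrac12(b - 1)$ if $D \equiv 1 \bmod 4$; in other words, 
$b' = \tfrac12(b - \operatorname{Tr}(\delta))$ in both cases. The definition of $\mathcal{I}$ is to be 

$$\mathcal{I}(a, b, c) = \begin{cases} \langle a,\ b' + \delta\rangle & \text{if } a > 0, \\ \langle \delta a,\ \delta(b' + \delta)\rangle & \text{if } a < 0. \end{cases}$$

The right sides in the above display make sense as ideals if the angular brackets 
are replaced by parentheses. To see that the definitions make sense, we thus need 
to check that $\langle a, b' + \delta\rangle = (a, b' + \delta)$ for all $a$ and that the orientations are 
positive. Lemma 1.19a below shows that $\langle a, b' + \delta\rangle = (a, b' + \delta)$ if it is proved 
that $a$ divides $N(b' + \delta)$, and the computation that verifies this divisibility is 

$$N(b' + \delta) = b'^2 + b'(\delta + \sigma(\delta)) + \delta\sigma(\delta)$$

$$= \begin{cases} b'^2 + b' + \tfrac14(1 - m) & \text{if } D \equiv 1 \bmod 4, \\ b'^2 - m & \text{if } D \equiv 0 \bmod 4, \end{cases}$$

$$= \begin{cases} \tfrac14(b - 1)^2 + \tfrac12(b - 1) + \tfrac14(1 - D) & \text{if } D \equiv 1 \bmod 4, \\ \tfrac14 b^2 - \tfrac14 D & \text{if } D \equiv 0 \bmod 4, \end{cases}$$

$$= \tfrac14(b^2 - D) = ac.$$

From the definitions near the beginning of this section, the orientation of $\langle r_1, r_2\rangle$ 
is given by the sign of $(\sqrt{m})^{-1}(r_1\sigma(r_2) - \sigma(r_1)r_2)$. Thus 

$$\operatorname{orientation}\langle a, b' + \delta\rangle = \operatorname{sgn}\bigl((\sqrt{m})^{-1}a(\sigma(\delta) - \delta)\bigr) = \operatorname{sgn} a,$$

$$\operatorname{orientation}\langle \delta a, \delta(b' + \delta)\rangle = \operatorname{sgn}\bigl((\sqrt{m})^{-1}(\delta a\,\sigma(\delta b' + \delta^2) - \sigma(\delta)a\,\delta(b' + \delta))\bigr)
= \operatorname{sgn}\bigl((\sqrt{m})^{-1}N(\delta)a(\sigma(\delta) - \delta)\bigr) = -\operatorname{sgn} a,$$

and the orientations are positive in both cases. 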
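
The identity $N(b' + \delta) = ac$ is pure integer arithmetic once $\operatorname{Tr}(\delta)$ and $N(\delta)$ are known, so it can be spot-checked mechanically. The Python sketch below assumes, in line with the case split in the computation above, that $\operatorname{Tr}(\delta) = 0$ and $N(\delta) = -D/4$ when $D \equiv 0 \bmod 4$, and that $\operatorname{Tr}(\delta) = 1$ and $N(\delta) = (1 - D)/4$ when $D \equiv 1 \bmod 4$. The function names and the sample forms are hypothetical choices made for this illustration.

```python
def delta_trace_norm(D):
    """Return (Tr(delta), N(delta)) for discriminant D.

    Assumption: Tr = 0, N = -D/4 if D ≡ 0 mod 4; Tr = 1, N = (1-D)/4 if D ≡ 1 mod 4.
    """
    if D % 4 == 0:
        return 0, -D // 4
    if D % 4 == 1:
        return 1, (1 - D) // 4
    raise ValueError("D must be congruent to 0 or 1 mod 4")

def norm_b_prime_plus_delta(a, b, c):
    """Return (b', N(b' + delta)) for the form (a, b, c) with D = b^2 - 4ac."""
    D = b * b - 4 * a * c
    tr, nm = delta_trace_norm(D)
    b_prime = (b - tr) // 2                      # exact, since b ≡ D mod 2
    return b_prime, b_prime * b_prime + b_prime * tr + nm

# Sample forms (a, b, c) with discriminants -20, -20, 5, 5, 13, -3.
forms = [(1, 0, 5), (2, 2, 3), (1, 1, -1), (-1, 1, 1), (3, 1, -1), (1, 1, 1)]
checks = [norm_b_prime_plus_delta(a, b, c)[1] == a * c for a, b, c in forms]
```

Each entry of `checks` records whether $N(b' + \delta)$ equals $ac$ for the corresponding form; in particular $a$ divides $N(b' + \delta)$ in every sampled case, which is the hypothesis needed for Lemma 1.19a.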


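Lemma 1.19. 

(a) If $a \ne 0$ and $b'$ are integers such that $a$ divides $N(b' + \delta)$ in $\mathbb{Z}$, then 
$\langle a, b' + \delta\rangle = (a, b' + \delta)$ in the sense that the free abelian subgroup of $R$ generated 
by $a$ and $b' + \delta$ coincides with the ideal generated by $a$ and $b' + \delta$. 

(b) If $I$ is any nonzero ideal in $R$, then $I$ is of the form $I = \langle a, r\rangle$ for some 
integer $a > 0$ and some $r$ in $R$. 


PROOF. For (a), we are to show that $I' = \mathbb{Z}a + \mathbb{Z}(b' + \delta)$ is closed under 
multiplication by the generators 1 and $\delta$ of $R$. Closure of $I'$ under multiplication 
by 1 is evident, and the formula $\delta a = -b'a + a(b' + \delta)$ shows that $\delta(\mathbb{Z}a) \subseteq I'$. 
Addition of $\delta b'$ to the sum of the two formulas $\delta^2 = \delta(\delta + \sigma(\delta)) - \delta\sigma(\delta) = 
\delta\operatorname{Tr}(\delta) - N(\delta)$ and $N(b' + \delta) = b'^2 + b'\operatorname{Tr}(\delta) + N(\delta)$ yields 

$$\delta(b' + \delta) = -N(b' + \delta) + (b' + \operatorname{Tr}(\delta))(b' + \delta),$$

which shows that $\delta(b' + \delta) \in I'$ because $N(b' + \delta)$ is by assumption an integer 
multiple of $a$. 

For (b), we start from any $\mathbb{Z}$ basis $\{r_1, r_2\}$ of $I$, say with $r_1 = a_1 + b_1\delta$ and 
$r_2 = a_2 + b_2\delta$, and let $d = \operatorname{GCD}(b_1, b_2)$. Choose integers $n_1$ and $n_2$ with 
$n_1b_1 + n_2b_2 = d$. Then $\operatorname{GCD}(n_1, n_2) = 1$, and we can therefore find integers 
$k_1$ and $k_2$ with $\det\begin{pmatrix} k_1 & k_2 \\ n_1 & n_2\end{pmatrix} = 1$. Consequently 
$\begin{pmatrix} s_1 \\ s_2\end{pmatrix} = \begin{pmatrix} k_1 & k_2 \\ n_1 & n_2\end{pmatrix}\begin{pmatrix} r_1 \\ r_2\end{pmatrix}$ is a new $\mathbb{Z}$ 
basis of $I$ of the form 

$$s_1 = c_1 + kd\delta,$$
$$s_2 = c_2 + d\delta.$$

If we put $a = s_1 - ks_2$ and possibly replace $a$ by its negative, then $\{a, s_2\}$ is a $\mathbb{Z}$ 
basis of $I$ of the required form. 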
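
The change of basis in the proof of (b) is effective, and a small Python sketch may make it concrete. Elements $a_i + b_i\delta$ are encoded here as integer pairs `(a_i, b_i)`; the function names are hypothetical. The sketch makes one particular choice, $k_1 = b_2/d$ and $k_2 = -b_1/d$, for which the $\delta$-coefficient of $s_1$ is already 0 (so $k = 0$ in the notation of the proof). It only performs the unimodular change of basis and assumes, without checking, that the input pairs really form a $\mathbb{Z}$ basis of an ideal, so that $b_1$ and $b_2$ are not both 0.

```python
def ext_gcd(x, y):
    """Return (g, u, v) with u*x + v*y == g, where g = gcd(x, y) >= 0."""
    old_r, r = x, y
    old_u, u = 1, 0
    old_v, v = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    if old_r < 0:
        old_r, old_u, old_v = -old_r, -old_u, -old_v
    return old_r, old_u, old_v

def normalize_basis(r1, r2):
    """Turn a Z basis {a1 + b1*delta, a2 + b2*delta} into one of the form {a, s2}.

    Returns (a, s2) with a a positive integer and s2 = (c2, d) encoding c2 + d*delta.
    """
    (a1, b1), (a2, b2) = r1, r2
    d, n1, n2 = ext_gcd(b1, b2)          # n1*b1 + n2*b2 == d > 0
    k1, k2 = b2 // d, -b1 // d           # exact divisions; k1*n2 - k2*n1 == 1
    assert k1 * n2 - k2 * n1 == 1        # the change of basis is unimodular
    s1 = (k1 * a1 + k2 * a2, k1 * b1 + k2 * b2)   # delta-coefficient is 0
    s2 = (n1 * a1 + n2 * a2, d)
    return abs(s1[0]), s2

example = normalize_basis((3, 4), (5, 6))
```

For the sample input, `example` is the pair consisting of the integer generator and the second basis element; the index of the lattice in $\mathbb{Z} + \mathbb{Z}\delta$ is unchanged because the transformation has determinant 1.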


Theorem 1.20. The set $\mathcal{H}$ of strict equivalence classes of nonzero ideals 
relative to the field $K = \mathbb{Q}(\sqrt{m})$ is a finite abelian group. Moreover, the mapping 
$\mathcal{F}$ that carries a positively oriented expression $I = \langle r_1, r_2\rangle$ for a nonzero ideal 
of $R$ to a binary quadratic form depends only on $I$, not the ordered $\mathbb{Z}$ basis, and 
descends to an isomorphism of the group $\mathcal{H}$ onto the form class group $H$ for 
the discriminant $D$ of the field $K$, i.e., the group of proper equivalence classes of 
binary quadratic forms of discriminant $D$, subject to the remark below. Moreover, 
the mapping $\mathcal{I}$ with domain all binary quadratic forms whose discriminant equals 
the field discriminant of $K$, sending such a form to a positively oriented expression 
for a nonzero ideal of $R$, descends to be defined from $H$ to $\mathcal{H}$, and the descended 
map is the two-sided inverse of the isomorphism induced by $\mathcal{F}$. 


REMARK. If m < 0, H is understood as usual to include only the classes of 
the positive definite forms. 


PROOF. The proof proceeds in six steps. 


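Step 1. We show that the proper equivalence class of the quadratic form 
$\mathcal{F}(I, r_1, r_2)$ depends only on the ideal $I$, not the positively oriented expression 
$I = \langle r_1, r_2\rangle$ for it. Thus the class of the form can be abbreviated as $\mathcal{F}(I)$. 

Suppose that $I = \langle s_1, s_2\rangle$ is another positively oriented expression for $I$. Then 
we can write $\begin{pmatrix} r_1 \\ r_2\end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta\end{pmatrix}\begin{pmatrix} s_1 \\ s_2\end{pmatrix}$ for a matrix $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta\end{pmatrix}$ in $GL(2, \mathbb{Z})$, and we have 
seen that 

$$\begin{pmatrix} r_1 & \sigma(r_1) \\ r_2 & \sigma(r_2)\end{pmatrix} = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta\end{pmatrix}\begin{pmatrix} s_1 & \sigma(s_1) \\ s_2 & \sigma(s_2)\end{pmatrix} \qquad (*)$$

and that 

$$\det\begin{pmatrix} r_1 & \sigma(r_1) \\ r_2 & \sigma(r_2)\end{pmatrix} = \pm\det\begin{pmatrix} s_1 & \sigma(s_1) \\ s_2 & \sigma(s_2)\end{pmatrix},$$

where $\pm 1$ is the determinant of $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta\end{pmatrix}$. Since both expressions $I = \langle r_1, r_2\rangle$ and 
$I = \langle s_1, s_2\rangle$ are positively oriented, it follows that the sign in the determinant 
equation is plus, hence that $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta\end{pmatrix}$ is in $SL(2, \mathbb{Z})$. Substituting from $(*)$ into the 
formula for the matrix associated to the binary quadratic form $\mathcal{F}(I, r_1, r_2)$, we 
obtain the matrix 

$$N(I)^{-1}\begin{pmatrix} \alpha & \beta \\ \gamma & \delta\end{pmatrix}\begin{pmatrix} s_1 & \sigma(s_1) \\ s_2 & \sigma(s_2)\end{pmatrix}\begin{pmatrix} \sigma(s_1) & \sigma(s_2) \\ s_1 & s_2\end{pmatrix}\begin{pmatrix} \alpha & \gamma \\ \beta & \delta\end{pmatrix}. \qquad (**)$$

The product of the coefficient $N(I)^{-1}$ and the middle two matrices is the matrix 
associated to the quadratic form $\mathcal{F}(I, s_1, s_2)$, and $(**)$ therefore exhibits the two 
quadratic forms as properly equivalent. 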


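Step 2. We show that the proper equivalence class $\mathcal{F}(I)$ does not change when 
we replace $I$ by a strictly equivalent ideal. 

Thus let $I = \langle r_1, r_2\rangle$ and $J = \langle s_1, s_2\rangle$ be expressions for $I$ and $J$, and 
suppose that $(r)$ and $(s)$ are nonzero principal ideals such that $(r)I = (s)J$ 
and $N(s/r) > 0$. The formula 

$$\det\begin{pmatrix} rr_1 & \sigma(rr_1) \\ rr_2 & \sigma(rr_2)\end{pmatrix} = N(r)\det\begin{pmatrix} r_1 & \sigma(r_1) \\ r_2 & \sigma(r_2)\end{pmatrix}$$

shows that the expression $(r)I = \langle rr_1, rr_2\rangle$ is positively oriented if $N(r) > 0$ 
and is negatively oriented if $N(r) < 0$. Similarly $(s)J = \langle ss_1, ss_2\rangle$ is positively 
oriented if $N(s) > 0$ and is negatively oriented if $N(s) < 0$. Since $N(r/s) > 0$, 
$N(r)$ and $N(s)$ are both positive or both negative. Possibly replacing $r$ and $s$ by 
$r\sqrt{m}$ and $s\sqrt{m}$, we may assume that $N(r)$ and $N(s)$ are both positive. Then the 
matrix associated to the quadratic form $\mathcal{F}((r)I, rr_1, rr_2)$ is 

$$N(rI)^{-1}\begin{pmatrix} rr_1 & \sigma(rr_1) \\ rr_2 & \sigma(rr_2)\end{pmatrix}\begin{pmatrix} \sigma(rr_1) & \sigma(rr_2) \\ rr_1 & rr_2\end{pmatrix}$$

$$= N(rI)^{-1}\begin{pmatrix} r_1 & \sigma(r_1) \\ r_2 & \sigma(r_2)\end{pmatrix}\begin{pmatrix} r & 0 \\ 0 & \sigma(r)\end{pmatrix}\begin{pmatrix} \sigma(r) & 0 \\ 0 & r\end{pmatrix}\begin{pmatrix} \sigma(r_1) & \sigma(r_2) \\ r_1 & r_2\end{pmatrix}$$

$$= N(rI)^{-1}N(r)\begin{pmatrix} r_1 & \sigma(r_1) \\ r_2 & \sigma(r_2)\end{pmatrix}\begin{pmatrix} \sigma(r_1) & \sigma(r_2) \\ r_1 & r_2\end{pmatrix}$$

$$= |N(r)|^{-1}N(I)^{-1}N(r)\begin{pmatrix} r_1 & \sigma(r_1) \\ r_2 & \sigma(r_2)\end{pmatrix}\begin{pmatrix} \sigma(r_1) & \sigma(r_2) \\ r_1 & r_2\end{pmatrix}$$

$$= N(I)^{-1}\begin{pmatrix} r_1 & \sigma(r_1) \\ r_2 & \sigma(r_2)\end{pmatrix}\begin{pmatrix} \sigma(r_1) & \sigma(r_2) \\ r_1 & r_2\end{pmatrix},$$

while the matrix associated to $\mathcal{F}((s)J, ss_1, ss_2)$, by a similar computation, is 

$$N(J)^{-1}\begin{pmatrix} s_1 & \sigma(s_1) \\ s_2 & \sigma(s_2)\end{pmatrix}\begin{pmatrix} \sigma(s_1) & \sigma(s_2) \\ s_1 & s_2\end{pmatrix}.$$

Since $(r)I = (s)J$, Step 1 shows that $\mathcal{F}((r)I, rr_1, rr_2)$ is properly equivalent to 
$\mathcal{F}((s)J, ss_1, ss_2)$. 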

Step 3. We show that $\mathcal{I}(a, b, c)$ depends only on the proper equivalence class 
of the binary quadratic form $(a, b, c)$. 

Problem 37 at the end of Chapter VII of Basic Algebra shows that $SL(2, \mathbb{Z})$ 
is generated by $\alpha = \begin{pmatrix} 0 & 1 \\ -1 & 0\end{pmatrix}$ and $\beta = \begin{pmatrix} 0 & -1 \\ 1 & 1\end{pmatrix}$, hence by $\alpha\beta = \begin{pmatrix} 1 & 1 \\ 0 & 1\end{pmatrix}$ and 
$\alpha^{-1} = \begin{pmatrix} 0 & -1 \\ 1 & 0\end{pmatrix}$. Thus it is enough to handle $\alpha\beta$ and $\alpha^{-1}$. 

The operation of $\alpha\beta = \begin{pmatrix} 1 & 1 \\ 0 & 1\end{pmatrix}$ on forms sends $(a, b, c)$ into the translate 
$(a, b + 2a, *)$. Define $b' = \tfrac12(b - \operatorname{Tr}(\delta))$ in the same way as when $\mathcal{I}$ was defined. 
If $a > 0$, then $\mathcal{I}(a, b, c) = (a, b' + \delta)$, and $\mathcal{I}(a, b + 2a, *) = (a, (b + 2a)' + \delta) = 
(a, b' + a + \delta)$; thus the two image ideals are the same. If $a < 0$, then the respective 
images are $(\delta)(a, b' + \delta)$ and $(\delta)(a, b' + a + \delta)$, and again the image ideals are 
the same. 

To handle $\alpha^{-1} = \begin{pmatrix} 0 & -1 \\ 1 & 0\end{pmatrix}$, we are to show that the ideals $\mathcal{I}(a, b, c)$ and 
$\mathcal{I}(c, -b, a)$ are strictly equivalent. We saw just after the definition of $\mathcal{I}$ that 
$N(b' + \delta) = ac$. There are four cases to the proof of the strict equivalence 
according to the signs of $a$ and $c$. Let us use the symbol $\sim$ to denote “is strictly 
equivalent to.” 

Suppose that $a > 0$ and $c > 0$, so that $N(b' + \delta) > 0$. Then 

$$\mathcal{I}(a, b, c) = (a, b' + \delta) \sim (b' + \sigma(\delta))(a, b' + \delta) = (a(b' + \sigma(\delta)), N(b' + \delta))$$
$$= (a(b' + \sigma(\delta)), ac) = (a)(b' + \sigma(\delta), c)$$
$$\sim (c, b' + \sigma(\delta)) = (c, -b' - \sigma(\delta)) = (c, (-b)' + \delta),$$

the last equality holding because $b' + (-b)' = -\operatorname{Tr}\delta = -\delta - \sigma(\delta)$. The right 
side equals $\mathcal{I}(c, -b, a)$, and the strict equivalence is proved in this case. 

Suppose that $a < 0$ and $c < 0$, so that $N(b' + \delta) > 0$. Then 

$$\mathcal{I}(a, b, c) = (\delta)(a, b' + \delta) \sim (b' + \sigma(\delta))(\delta)(a, b' + \delta)$$
$$= (\delta)(a(b' + \sigma(\delta)), N(b' + \delta)) = (\delta)(a(b' + \sigma(\delta)), ac)$$
$$= (a)(\delta)(b' + \sigma(\delta), c) \sim (\delta)(c, b' + \sigma(\delta))$$
$$= (\delta)(c, -b' - \sigma(\delta)) = (\delta)(c, (-b)' + \delta) = \mathcal{I}(c, -b, a),$$

and the strict equivalence is proved in this case. 

Suppose that $a > 0$ and $c < 0$, so that $N(b' + \delta) < 0$. Then $N(\delta)N(b' + \delta)$ 
is positive, and 

$$\mathcal{I}(a, b, c) = (a, b' + \delta) \sim (\delta)(b' + \sigma(\delta))(a, b' + \delta)$$
$$= (\delta)(a(b' + \sigma(\delta)), N(b' + \delta)) = (\delta)(a(b' + \sigma(\delta)), ac)$$
$$= (a)(\delta)(b' + \sigma(\delta), c) \sim (\delta)(c, b' + \sigma(\delta)) = (\delta)(c, -b' - \sigma(\delta))$$
$$= (\delta)(c, (-b)' + \delta) = \mathcal{I}(c, -b, a),$$

and the strict equivalence is proved in this case. 

Suppose that $a < 0$ and $c > 0$, so that $N(b' + \delta) < 0$. Then $N(\delta)^{-1}N(b' + \delta)$ 
is positive, and 

$$\mathcal{I}(a, b, c) = (\delta)(a, b' + \delta) \sim (b' + \sigma(\delta))(a, b' + \delta)$$
$$= (a(b' + \sigma(\delta)), N(b' + \delta)) = (a(b' + \sigma(\delta)), ac) = (a)(b' + \sigma(\delta), c)$$
$$\sim (c, -b' - \sigma(\delta)) = (c, (-b)' + \delta) = \mathcal{I}(c, -b, a),$$

and the strict equivalence is proved in this case. 


Step 4. We show that the mapping of the set $H$ of proper equivalence classes 
of forms to itself induced by $\mathcal{F}\mathcal{I}$ is the identity. 

Let the given form be $(a, b, c)$. With $b'$ defined to be $\tfrac12(b - \operatorname{Tr}(\delta))$ as usual, 
we have seen that $N(b' + \delta) = ac$. Therefore $a$ divides $N(b' + \delta)$, and Lemma 
1.19a shows that $\langle a, b' + \delta\rangle = (a, b' + \delta)$ in the sense that the ideal generated by 
$a$ and $b' + \delta$ matches the free abelian group generated by these two elements. 

First suppose that $a > 0$. Then $\mathcal{I}(a, b, c) = \langle a, b' + \delta\rangle = (a, b' + \delta)$, and we 
know that this expression is positively oriented. Calculation gives 

$$N(I) = |\sqrt{D}|^{-1}\left|\det\begin{pmatrix} a & a \\ b' + \delta & b' + \sigma(\delta)\end{pmatrix}\right| = a\,|\sqrt{D}|^{-1}\,|\sigma(\delta) - \delta|$$

$$= a \cdot \begin{cases} |\sqrt{m}|/|\sqrt{m}| & \text{if } D \equiv 1 \bmod 4, \\ |2\sqrt{m}|/|2\sqrt{m}| & \text{if } D \equiv 0 \bmod 4, \end{cases} \;= a. \qquad (\dagger)$$

Therefore the quadratic form $\mathcal{F}\mathcal{I}(a, b, c)$ is 

$$N(I)^{-1}(ax + (b' + \delta)y)(ax + (b' + \sigma(\delta))y)$$
$$= a^{-1}\bigl(a^2x^2 + a(2b' + (\delta + \sigma(\delta)))xy + N(b' + \delta)y^2\bigr)$$
$$= ax^2 + (2b' + \operatorname{Tr}(\delta))xy + cy^2$$
$$= ax^2 + bxy + cy^2,$$

and we see that $\mathcal{F}\mathcal{I}(a, b, c) = (a, b, c)$ when $a > 0$. 

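Next suppose that $a < 0$. Then $\mathcal{I}(a, b, c) = \langle \delta a, \delta(b' + \delta)\rangle = (\delta a, \delta(b' + \delta))$, 
and we know that this expression is positively oriented. Since $a < 0$ cannot occur 
for $m < 0$, $N(\delta)$ is negative. Thus calculation gives 

$$N(I) = N((\delta)(a, b' + \delta)) = N((\delta)(-a, b' + \delta)) = |N(\delta)|\,N((-a, b' + \delta))$$
$$= |N(\delta)||a| = N(\delta)a,$$

the next-to-last equality following from the calculation that gives $(\dagger)$. Therefore 
the quadratic form $\mathcal{F}\mathcal{I}(a, b, c)$ is 

$$N(I)^{-1}\bigl(a\delta x + (b' + \delta)\delta y\bigr)\bigl(a\sigma(\delta)x + (b' + \sigma(\delta))\sigma(\delta)y\bigr)$$
$$= N(I)^{-1}N(\delta)(ax + (b' + \delta)y)(ax + (b' + \sigma(\delta))y)$$
$$= a^{-1}\bigl(a^2x^2 + a(2b' + (\delta + \sigma(\delta)))xy + N(b' + \delta)y^2\bigr)$$
$$= ax^2 + (2b' + \operatorname{Tr}(\delta))xy + cy^2$$
$$= ax^2 + bxy + cy^2,$$

and we see that $\mathcal{F}\mathcal{I}(a, b, c) = (a, b, c)$ when $a < 0$. 

Step 5. We show that the mapping of the set $\mathcal{H}$ of strict equivalence classes of 
ideals to itself induced by $\mathcal{I}\mathcal{F}$ is the identity. In view of Step 4, it follows that $\mathcal{F}$ 
and $\mathcal{I}$ are both one-one onto. Since Theorem 1.6a shows $H$ to be finite, $\mathcal{H}$ has to 
be finite, and Lemma 1.18 shows that the multiplication on $\mathcal{H}$ makes $\mathcal{H}$ into an 
abelian group. 

Let an ideal $I$ be given, and apply Lemma 1.19b to write $I = \langle \tilde a, r\rangle$ with $\tilde a > 0$ 
an integer. The expression deciding orientation is $\tilde a\sigma(r) - \sigma(\tilde a)r = \tilde a(\sigma(r) - r)$, 
and this is multiplied by $-1$ if $r$ is replaced by $-r$. Possibly changing $r$ to $-r$ in 
the expression for $I$, we may therefore assume that the expression $I = \langle \tilde a, r\rangle$ is 
positively oriented. Write $r = c + d\delta$. Then 

$$\sigma(r) - r = d(\sigma(\delta) - \delta) = \begin{cases} d\sqrt{m} & \text{if } m \equiv 1 \bmod 4, \\ 2d\sqrt{m} & \text{if } m \not\equiv 1 \bmod 4, \end{cases} \;= d\sqrt{D}.$$

The orientation of $I$ is given by $\tilde a(\sigma(r) - r) = \tilde a d\sqrt{D}$, and we deduce that $d > 0$ 
and that 

$$N(I) = |\sqrt{D}|^{-1}\,|\tilde a|\,|\sigma(r) - r| = \tilde a d.$$

The definition of $\mathcal{F}$ gives $\mathcal{F}(I, \tilde a, r) = N(I)^{-1}N(\tilde a x + ry)$, which is a quadratic 
form whose $x^2$ coefficient is $a = N(I)^{-1}\tilde a^2 = d^{-1}\tilde a$ and whose $xy$ coefficient is 

$$b = N(I)^{-1}\tilde a\operatorname{Tr}(r) = d^{-1}\operatorname{Tr}(r) = d^{-1}(2c + d\operatorname{Tr}(\delta)) = 2d^{-1}c + \operatorname{Tr}(\delta).$$

With $b'$ defined as usual to be $b' = \tfrac12(b - \operatorname{Tr}(\delta))$, we see that $b' = d^{-1}c$. 
Consequently $\mathcal{I}\mathcal{F}(I, \tilde a, r) = (a, b' + \delta) = (d^{-1}\tilde a, d^{-1}c + \delta)$. The product of 
this ideal with $(d)$ is $(\tilde a, c + d\delta) = (\tilde a, r) = I$, and thus $\mathcal{I}\mathcal{F}(I, \tilde a, r)$ is strictly 
equivalent to $I$. 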

Step 6. We show that the mapping induced by $\mathcal{I}$ from the set $H$ of proper 
equivalence classes of forms to the set $\mathcal{H}$ of strict equivalence classes of ideals 
respects the group operations in $H$ and $\mathcal{H}$ and hence is an isomorphism. 

Let two proper equivalence classes of forms with discriminant $D$ be given, 
and use Theorem 1.12a to choose representatives $(a, b, c)$ and $(\tilde a, b, \tilde c)$ with 
$\operatorname{GCD}(a, \tilde a) = 1$. The composition of the forms is well defined and is $(a\tilde a, b, *)$ 
for a suitable third entry in $\mathbb{Z}$. Let $b'$ be $\tfrac12(b - \operatorname{Tr}(\delta))$ as usual. We divide matters 
into cases according to the signs of $a$ and $\tilde a$. 

Suppose that $a > 0$ and $\tilde a > 0$. The definition of $\mathcal{I}$ shows that the ideals 
corresponding to the three quadratic forms in question are 

$$(a, b' + \delta), \quad (\tilde a, b' + \delta), \quad\text{and}\quad (a\tilde a, b' + \delta).$$

The product of the first two ideals is $(a\tilde a, a(b' + \delta), \tilde a(b' + \delta), (b' + \delta)^2)$, and we 
are to show that this equals $(a\tilde a, b' + \delta)$. In fact, the inclusion 

$$(a\tilde a, a(b' + \delta), \tilde a(b' + \delta), (b' + \delta)^2) \subseteq (a\tilde a, b' + \delta)$$

is clear. For the reverse inclusion we use the fact that $\operatorname{GCD}(a, \tilde a) = 1$ to write 
$k_1a + k_2\tilde a = 1$ for suitable integers $k_1$ and $k_2$. Then we see that $b' + \delta = 
k_1(a(b' + \delta)) + k_2(\tilde a(b' + \delta))$, and the reverse inclusion follows. 

Suppose that $a$ and $\tilde a$ are of opposite sign. By symmetry we may assume that 
$a > 0$ and $\tilde a < 0$. The three ideals are then 

$$(a, b' + \delta), \quad (\tilde a\delta, (b' + \delta)\delta), \quad\text{and}\quad (a\tilde a\delta, (b' + \delta)\delta),$$

while the product of the first two ideals is $(a\tilde a\delta, a(b' + \delta)\delta, \tilde a(b' + \delta)\delta, (b' + \delta)^2\delta) = 
(\delta)(a\tilde a, a(b' + \delta), \tilde a(b' + \delta), (b' + \delta)^2)$. From the previous paragraph this last 
ideal equals $(\delta)(a\tilde a, b' + \delta) = (a\tilde a\delta, (b' + \delta)\delta)$, and we have the required match. 

Suppose that $a < 0$ and $\tilde a < 0$. This time the product ideal is given by 
$(a\delta, (b' + \delta)\delta)(\tilde a\delta, (b' + \delta)\delta) = (\delta^2)(a\tilde a, a(b' + \delta), \tilde a(b' + \delta), (b' + \delta)^2) = 
(\delta^2)(a\tilde a, b' + \delta)$, the second equality following from the computation in the 
paragraph for $a$ and $\tilde a$ both positive. The ideal $(\delta^2)(a\tilde a, b' + \delta)$ is strictly equivalent 
to $(a\tilde a, b' + \delta)$ because $N(\delta^2) = N(\delta)^2$ is positive. Thus we have the required 
match on the level of strict equivalence classes. We conclude that the mapping 
of $H$ to $\mathcal{H}$ is a group isomorphism. 


8. Primes in the Progressions 4n + 1 and 4n + 3 


This section is the first of three sections about Dirichlet’s Theorem on primes in 
arithmetic progressions, whose statement is as follows. 


Theorem 1.21 (Dirichlet’s Theorem). If $m$ and $b$ are relatively prime integers 
with $m > 0$, then there exist infinitely many primes of the form $km + b$ with $k$ a 
positive integer. 


We begin with the earlier treatment of the arithmetic progressions $4n + 1$ and 
$4n + 3$ by Euler. In 1737 Euler made the stunning discovery of the formula 

$$\sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}},$$

valid for $s > 1$. Actually, the formula is valid for complex $s$ with $\operatorname{Re} s > 1$, but 
Euler had not considered powers $n^s$ with $s$ complex by this time and did not need 
them for his purpose. Euler’s formula is a consequence of unique factorization 
of integers. In fact, the product for $p \le N$ is 

$$\prod_{p \le N} \frac{1}{1 - p^{-s}} = \prod_{p \le N}\Bigl(1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \cdots\Bigr) = \sum_{\substack{n \text{ with no prime} \\ \text{divisors} > N}} \frac{1}{n^s}.$$

Letting $N \to \infty$, we obtain the desired formula. 

Built into the formula is the result of Euclid’s that there are infinitely many 
primes, i.e., infinitely many primes in the arithmetic progression $n$. There are 
two ways to see this. In both cases one starts from the observation that the sum 
$\sum_{n=1}^{\infty} 1/n^s$ is $\ge \int_1^{\infty} x^{-s}\,dx = 1/(s - 1)$, from which it follows that the sum 
tends to infinity as $s$ decreases to 1. In one case the argument continues with the 
observation that if there were only finitely many primes, then $\prod_{p \text{ prime}} \frac{1}{1 - p^{-s}}$ would 
certainly have finite limit as $s$ decreases to 1, and we arrive at a contradiction. 
In the other case the argument continues with the observation that the logarithm 
of $\frac{1}{1 - p^{-s}}$ is comparable in size to $1/p^s$, hence that $\log \sum_{n=1}^{\infty} 1/n^s$ is comparable 
to $\sum_{p \text{ prime}} 1/p^s$. Since $\sum_{n=1}^{\infty} 1/n^s$ tends to infinity, $\sum_{p \text{ prime}} 1/p^s$ must tend to 
infinity, and we conclude that there are infinitely many primes. We shall return 
to this observation shortly in order to justify it more rigorously.$^{10}$ 
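
Euler’s formula can be illustrated numerically. The Python sketch below takes $s = 2$ and compares a partial sum of $\sum 1/n^s$ with a partial Euler product over the primes up to the same bound; both should be close to the classical value $\pi^2/6$ of the full sum at $s = 2$, a standard fact that is not needed in the text. The cutoff `limit`, the helper `primes_up_to`, and the choice $s = 2$ are assumptions made for this illustration.

```python
import math

def primes_up_to(n):
    """Sieve of Eratosthenes: list of primes <= n (for n >= 1)."""
    sieve = [True] * (n + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i :: i] = [False] * len(range(i * i, n + 1, i))
    return [i for i, flag in enumerate(sieve) if flag]

s = 2.0
limit = 10_000

# Partial sum of the Dirichlet series over n <= limit.
partial_sum = sum(n ** -s for n in range(1, limit + 1))

# Partial Euler product over primes p <= limit; by unique factorization it
# equals the sum of n^{-s} over all n having no prime divisor > limit.
partial_product = math.prod(1.0 / (1.0 - p ** -s) for p in primes_up_to(limit))
```

Since every $n \le$ `limit` has no prime divisor exceeding `limit`, the partial product dominates the partial sum, which matches the description of the product for $p \le N$ given above.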

Euclid’s proof was much simpler: if there were only finitely many primes, 
then the sum of 1 and the product of all the primes would be divisible by none of 
the primes and would give a contradiction. The difficulty with Euclid’s argument 
is that there is no apparent way to adapt it to treat primes of the form 4n + 1. 
Euler’s argument, by contrast, does adapt to treat primes 4n + 1. 

Before continuing, let us make rigorous the notion of comparing sizes of factors 
of an infinite product with terms of an infinite series. An infinite product $\prod_{n=1}^{\infty} c_n$ 
with $c_n \in \mathbb{C}$ and with no factor 0 is said to converge if the sequence of partial 
products converges to a finite limit and the limit is not 0. A necessary condition 
for convergence is that $c_n$ tend to 1. 


Proposition 1.22. If $|a_n| < 1$ for all $n$, then the following conditions are 
equivalent: 

(a) $\prod_{n=1}^{\infty}(1 + |a_n|)$ converges, 

(b) $\sum_{n=1}^{\infty} |a_n|$ converges, 

(c) $\prod_{n=1}^{\infty}(1 - |a_n|)$ converges. 

In this case, $\prod_{n=1}^{\infty}(1 + a_n)$ converges. 


PROOF. Condition (c) is equivalent to 

(c′) $\prod_{n=1}^{\infty}(1 - |a_n|)^{-1}$ converges. 

For each of (a), (b), and (c′), convergence is equivalent to boundedness above. 
Since 

$$1 + \sum_{n=1}^{N} |a_n| \le \prod_{n=1}^{N}(1 + |a_n|) \le \prod_{n=1}^{N} \frac{1}{1 - |a_n|},$$

we see that (c′) implies (a) and that (a) implies (b). To see that (b) implies (c′), 
we may assume, without loss of generality, that $|a_n| \le \tfrac12$ for all $n$. Since $|x| \le \tfrac12$ 
implies that 

$$\Bigl|\log\frac{1}{1 - x}\Bigr| \le |x| \sup_{|t| \le |x| \le \frac12}\Bigl|\frac{d}{dt}\log\frac{1}{1 - t}\Bigr| = |x| \sup_{|t| \le |x| \le \frac12}\Bigl(\frac{1}{1 - t}\Bigr) \le 2|x|,$$

we have 

$$\log\Bigl(\prod_{n=1}^{N} \frac{1}{1 - |a_n|}\Bigr) = \sum_{n=1}^{N} \log\Bigl(\frac{1}{1 - |a_n|}\Bigr) \le 2\sum_{n=1}^{N} |a_n|.$$

Thus (b) implies (c′). 

$^{10}$In fact, this argument is showing that $\sum 1/p$ diverges, which says something more than just 
that there are infinitely many primes. 
Now suppose that (a) holds. To prove that $\prod_{n=1}^{\infty}(1 + a_n)$ converges, it is enough 
to show that $\prod_{n=M}^{N}(1 + a_n)$ tends to 1 as $M$ and $N$ tend to $\infty$. In the expression 

$$\Bigl|\prod_{n=M}^{N}(1 + a_n) - 1\Bigr|,$$

we expand out the product, move the absolute values in for each term, and 
reassemble the product. The result is the inequality 

$$\Bigl|\prod_{n=M}^{N}(1 + a_n) - 1\Bigr| \le \prod_{n=M}^{N}(1 + |a_n|) - 1.$$

By (a), the right side tends to 0 as $M$ and $N$ tend to $\infty$. Therefore so does the 
left side. This proves the proposition. 


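Using this proposition and its proof, we can give a more rigorous justification 
for the comparison of $\log \sum_{n=1}^{\infty} n^{-s}$ and $\sum_{p \text{ prime}} p^{-s}$ in Euler’s argument. 
Anticipating the notation that Riemann was to use for the function a century later, 
we introduce 

$$\zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s},$$

at the moment just for real $s$ with $s > 1$. (This function subsequently was 
named the Riemann zeta function and is defined and analytic for complex $s$ 
with $\operatorname{Re} s > 1$. We postpone a more serious discussion of $\zeta(s)$ to Proposition 
1.24 below.) We begin from the formula 

$$\log \zeta(s) = \sum_{p \text{ prime}} \log\frac{1}{1 - p^{-s}} = \sum_{p \text{ prime}} \Bigl(\frac{1}{p^s} + \Bigl(\log\frac{1}{1 - p^{-s}} - \frac{1}{p^s}\Bigr)\Bigr).$$

Let us see that this expression equals 

$$\sum_{p \text{ prime}} \frac{1}{p^s} + \text{bounded term} \quad\text{as } s \downarrow 1.$$

Going over the second displayed line in the proof of Proposition 1.22, which 
applied when $|x| \le \tfrac12$, we have 

$$\Bigl|\log\frac{1}{1 - x} - x\Bigr| \le |x| \sup_{|t| \le |x| \le \frac12}\Bigl|\frac{d}{dt}\Bigl(\log\frac{1}{1 - t} - t\Bigr)\Bigr|
= |x| \sup_{|t| \le |x| \le \frac12}\Bigl|\frac{1}{1 - t} - 1\Bigr| = |x| \sup_{|t| \le |x| \le \frac12}\Bigl|\frac{t}{1 - t}\Bigr| \le 2|x|^2.$$

For $x = p^{-s}$ with $s > 1$, this inequality becomes 

$$\Bigl|\log\frac{1}{1 - p^{-s}} - \frac{1}{p^s}\Bigr| \le 2p^{-2s}.$$

Consequently 

$$\Bigl|\log \zeta(s) - \sum_{p \text{ prime}} \frac{1}{p^s}\Bigr| = \Bigl|\sum_{p \text{ prime}} \Bigl[\log\frac{1}{1 - p^{-s}} - \frac{1}{p^s}\Bigr]\Bigr|
\le \sum_{p \text{ prime}} \Bigl|\log\frac{1}{1 - p^{-s}} - \frac{1}{p^s}\Bigr| \le 2\sum_{p \text{ prime}} p^{-2s}.$$

The right side is $\le 2\sum_{n=1}^{\infty} n^{-2}$ for all $s > 1$, and we arrive at the desired formula 

$$\log \zeta(s) = \sum_{p \text{ prime}} \frac{1}{p^s} + \text{bounded term} \quad\text{as } s \downarrow 1.$$

Since we know that $\log \zeta(s)$ increases without bound as $s$ decreases to 1, we 
can immediately conclude that there are infinitely many primes in the arithmetic 
progression $n$. 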
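
The elementary estimate $\bigl|\log\frac{1}{1-x} - x\bigr| \le 2|x|^2$ for $|x| \le \tfrac12$, which drives the bounded-term formula, is easy to test numerically. The Python sketch below samples the interval and also evaluates the bound at $x = p^{-s}$ for a few primes; the sampling grid, the value $s = 1.01$, the helper name `remainder`, and the tiny floating-point tolerance are choices made here for illustration.

```python
import math

def remainder(x):
    """Return |log(1/(1-x)) - x| for a real x with |x| < 1."""
    # log(1/(1-x)) = -log(1 - x) = -log1p(-x)
    return abs(-math.log1p(-x) - x)

# Sample x in [-1/2, 1/2] and compare with the bound 2*x^2
# (a tiny tolerance absorbs floating-point rounding near x = 0).
samples = [k / 1000.0 for k in range(-500, 501)]
bound_holds = all(remainder(x) <= 2.0 * x * x + 1e-15 for x in samples)

# The same bound at x = p^{-s}, for a few primes and s slightly above 1.
s = 1.01
prime_checks = [remainder(p ** -s) <= 2.0 * p ** (-2 * s) for p in (2, 3, 5, 7, 11)]
```

Because $\sum_p p^{-2s}$ is dominated by $\sum_n n^{-2}$ uniformly for $s > 1$, the per-prime bound checked here is what keeps the total discrepancy between $\log\zeta(s)$ and $\sum_p p^{-s}$ bounded as $s \downarrow 1$.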

With this argument well understood as a prototype, let us modify it to treat 
primes $4k + 1$ separately from primes $4k + 3$. Euler needed one further key idea 
to succeed. It is tempting to replace the sum over all primes of $p^{-s}$ in the above 
argument by 

$$\sum_{\substack{p \text{ prime}, \\ p \equiv 1 \bmod 4}} \frac{1}{p^s} \qquad\text{or}\qquad \sum_{\substack{p \text{ prime}, \\ p \equiv 3 \bmod 4}} \frac{1}{p^s},$$

trace backward, and see what happens. What happens is that the expansion of 
the corresponding product of $(1 - p^{-s})^{-1}$ as a sum does not yield anything very 
manageable. For example, with the first of the two sums, we are led to the 
logarithm of the series $\sum_{n=1}^{\infty} c(n)n^{-s}$, where $c(n)$ is 1 if $n$ is a product of primes 
$4k + 1$ and is 0 otherwise, and we have no direct way of deciding whether this 
diverges or converges as $s$ decreases to 1. 

Euler’s key additional idea was to work with the sum and difference of the 
displayed series, rather than the two terms separately, and then to recover the two 
displayed series at the end. Let us see what this idea accomplishes. Tracing 
backward in the derivation of the formula $\log \zeta(s) = \sum_{p \text{ prime}} p^{-s} + \text{bounded term}$, 
we want to obtain a series $\sum_{p \text{ prime}} a_p p^{-s}$ from the logarithm of a product 
$\prod_{p}(1 - a_p p^{-s})^{-1}$ and be able to recognize this product as equal to a manageable 
series $\sum_{n=1}^{\infty} b_n n^{-s}$. Guided by what happens for $\zeta(s)$, we can hope that $b_n$ will be 
readily computable from the $a_p$’s and the unique factorization of $n$. The relevant 
identities, which we shall verify below, are as follows: 

$$\sum_{n \text{ odd}} \frac{1}{n^s} = \prod_{\substack{p \text{ prime}, \\ p \text{ odd}}} \frac{1}{1 - p^{-s}},$$

$$\sum_{n \text{ odd}} \frac{(-1)^{\frac12(n-1)}}{n^s} = \prod_{\substack{p \text{ prime}, \\ p = 4k+1}} \frac{1}{1 - p^{-s}} \prod_{\substack{p \text{ prime}, \\ p = 4k+3}} \frac{1}{1 + p^{-s}}.$$

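In more detail let us write 

$$\chi_0(n) = \begin{cases} 0 & \text{if } n \equiv 0 \bmod 2, \\ 1 & \text{if } n \equiv 1 \bmod 2, \end{cases}
\qquad
\chi_1(n) = \begin{cases} 0 & \text{if } n \equiv 0 \bmod 2, \\ 1 & \text{if } n \equiv 1 \bmod 4, \\ -1 & \text{if } n \equiv 3 \bmod 4. \end{cases}$$

With $\chi$ equal to $\chi_0$ or $\chi_1$, we have $\chi(mn) = \chi(m)\chi(n)$ for all $m$ and $n$. 
Consequently the two expressions $\sum_{n \text{ odd}} \frac{1}{n^s}$ and $\sum_{n \text{ odd}} \frac{(-1)^{\frac12(n-1)}}{n^s}$ are both of 
the form 

$$L(s, \chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s},$$

the function $\chi$ being $\chi_0$ for the first series and being $\chi_1$ for the second series. As 
we shall verify rigorously in the next section, the same argument via unique 
factorization that yields Euler’s identity $\sum_{n=1}^{\infty} n^{-s} = \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}}$ gives a 
factorization 

$$\sum_{n=1}^{\infty} \frac{\chi(n)}{n^s} = \prod_{p \text{ prime}} \frac{1}{1 - \chi(p)p^{-s}}$$

because of the identity $\chi(mn) = \chi(m)\chi(n)$. Going over the argument that 
$\log \zeta(s)$ is the sum of $\sum_{p \text{ prime}} p^{-s}$ and a bounded term, we find that 

$$\log L(s, \chi) = \sum_{p \text{ prime}} \frac{\chi(p)}{p^s} + g(s, \chi)$$

with $g(s, \chi)$ bounded as $s \downarrow 1$. The sum and difference for the two choices of 
$\chi$ gives 

$$\log\bigl(L(s, \chi_0)L(s, \chi_1)\bigr) = 2\sum_{p = 4k+1} \frac{1}{p^s} + \bigl(g(s, \chi_0) + g(s, \chi_1)\bigr)$$

and 

$$\log\bigl(L(s, \chi_0)L(s, \chi_1)^{-1}\bigr) = 2\sum_{p = 4k+3} \frac{1}{p^s} + \bigl(g(s, \chi_0) - g(s, \chi_1)\bigr).$$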

The function $L(s, \chi_0)$ is the product of $\zeta(s)$ and an elementary factor. In fact, 
a change of index of summation in the formula defining $\zeta(s)$ gives $2^{-s}\zeta(s) = 
\sum_{n \text{ even}} n^{-s}$. Subtracting this formula from the definition of $\zeta(s)$ gives 

$$L(s, \chi_0) = \sum_{n \text{ odd}} \frac{1}{n^s} = (1 - 2^{-s})\zeta(s).$$

Therefore 

$$\lim_{s \downarrow 1} L(s, \chi_0) = +\infty.$$

Meanwhile, the series $L(s, \chi_1) = \sum_{n \text{ odd}} \frac{(-1)^{\frac12(n-1)}}{n^s}$ is alternating and converges 
for $s > 0$ by the Leibniz test. The convergence is uniform on compact sets, and 
the sum $L(s, \chi_1)$ is continuous for $s > 0$. Grouping the terms of this series in 
pairs, we see that $L(1, \chi_1)$ is positive.$^{11}$ Hence we have 

$$0 < \lim_{s \downarrow 1} L(s, \chi_1) < +\infty.$$


Putting together the two limit relations for $L(s, \chi_0)$ and $L(s, \chi_1)$ as $s$ decreases 
to 1, we see that 

$$\log\bigl(L(s, \chi_0)L(s, \chi_1)\bigr) \quad\text{and}\quad \log\bigl(L(s, \chi_0)L(s, \chi_1)^{-1}\bigr)$$

both tend to $+\infty$ as $s \downarrow 1$. Referring to the values computed above for these 
expressions and taking into account that $\sum 1/p$ exceeds $\sum 1/p^s$ when $s > 1$, 
we see that 

$$\sum_{\substack{p \text{ prime} \\ p = 4k+1}} \frac{1}{p} \quad\text{and}\quad \sum_{\substack{p \text{ prime} \\ p = 4k+3}} \frac{1}{p}$$

are both infinite. Hence there are infinitely many primes $4k + 1$, and there are 
infinitely many primes $4k + 3$. 

$^{11}$We can even recognize the value of $L(1, \chi_1)$ as $\pi/4$ from the Taylor series of $\arctan x$, but the 
explicit value is not needed in the argument. 
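
The two functions $\chi_0$ and $\chi_1$ and the series $L(s, \chi)$ are simple enough to experiment with directly. The Python sketch below checks the multiplicativity $\chi(mn) = \chi(m)\chi(n)$ on a finite range and evaluates a partial sum of $L(1, \chi_1)$, which should be close to the value $\pi/4$ mentioned in the footnote. The function names, the finite ranges, and the number of terms are assumptions made for this illustration.

```python
import math

def chi0(n):
    """chi_0(n): 0 for even n, 1 for odd n."""
    return 0 if n % 2 == 0 else 1

def chi1(n):
    """chi_1(n): 0 for even n, +1 if n ≡ 1 mod 4, -1 if n ≡ 3 mod 4."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

# Finite check of chi(mn) = chi(m) chi(n) for both functions.
multiplicative = all(
    chi(m * n) == chi(m) * chi(n)
    for chi in (chi0, chi1)
    for m in range(1, 60)
    for n in range(1, 60)
)

def L_partial(s, chi, terms):
    """Partial sum of L(s, chi) over 1 <= n <= terms."""
    return sum(chi(n) / n ** s for n in range(1, terms + 1))

# Partial sum of the alternating series 1 - 1/3 + 1/5 - ... at s = 1.
approx = L_partial(1.0, chi1, 20001)
```

The alternating-series error estimate guarantees that `approx` differs from $L(1, \chi_1)$ by less than the first omitted term, so the partial sum is already visibly positive and bounded, in contrast with the divergence of $L(s, \chi_0)$ as $s \downarrow 1$.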

The proof of the general case of Dirichlet’s Theorem (Theorem 1.21) will 
proceed in similar fashion. We return to it in Section 10 after a brief but systematic 
investigation of the kinds of series and products that we have encountered in the 
present section. 


9. Dirichlet Series and Euler Products 


A series $\sum_{n=1}^{\infty} a_n n^{-s}$ with $a_n$ and $s$ complex is called a Dirichlet series. The 
first result below shows that the region of convergence and the region of absolute 
convergence for such a series are each right half-planes in $\mathbb{C}$ unless they are equal 
to the empty set or to all of $\mathbb{C}$. These half-planes may not be the same: for 
example, $\sum_{n=1}^{\infty} (-1)^n n^{-s}$ is convergent for $\operatorname{Re} s > 0$ and absolutely convergent 
for $\operatorname{Re} s > 1$. 


Proposition 1.23. Let $\sum_{n=1}^{\infty} a_n n^{-s}$ be a Dirichlet series. 

(a) If the series is convergent for $s = s_0$, then it is convergent uniformly on 
compact sets for $\operatorname{Re} s > \operatorname{Re} s_0$, and the sum of the series is analytic in this region. 

(b) If the series is absolutely convergent for $s = s_0$, then it is uniformly 
absolutely convergent for $\operatorname{Re} s \ge \operatorname{Re} s_0$. 

(c) If the series is convergent for $s = s_0$, then it is absolutely convergent for 
$\operatorname{Re} s > \operatorname{Re} s_0 + 1$. 

(d) If the series is convergent at some $s_0$ and sums to 0 in a right half-plane, 
then all the coefficients are 0. 


REMARK. The proof of (a) will use the summation by parts formula. Namely 
if $\{u_n\}$ and $\{v_n\}$ are sequences and if $U_n = \sum_{k=1}^{n} u_k$ for $n \ge 0$, then $1 \le M \le N$ 
implies 

$$\sum_{n=M}^{N} u_n v_n = \sum_{n=M}^{N-1} U_n(v_n - v_{n+1}) + U_N v_N - U_{M-1} v_M. \qquad (*)$$
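
The summation by parts formula $(*)$ is a finite algebraic identity, so it can be verified exactly on integer sequences. In the Python sketch below the lists `u` and `v` are indexed from 1 (index 0 is a placeholder) and $U_0 = 0$; the particular sequences and the function name are hypothetical choices made for this illustration.

```python
def summation_by_parts_sides(u, v, M, N):
    """Return (left, right) sides of the summation by parts formula.

    u and v are lists indexed from 1 (index 0 unused); requires 1 <= M <= N
    and len(u), len(v) > N.
    """
    U = [0] * (N + 1)                       # partial sums, with U[0] = 0
    for n in range(1, N + 1):
        U[n] = U[n - 1] + u[n]
    left = sum(u[n] * v[n] for n in range(M, N + 1))
    right = (sum(U[n] * (v[n] - v[n + 1]) for n in range(M, N))
             + U[N] * v[N] - U[M - 1] * v[M])
    return left, right

# Integer test sequences, so that both sides are computed exactly.
u = [0] + [(-1) ** n * n for n in range(1, 13)]
v = [0] + [n * n + 1 for n in range(1, 13)]
pairs = [summation_by_parts_sides(u, v, M, N)
         for M in range(1, 12) for N in range(M, 12)]
```

Every pair in `pairs` has equal entries, including the degenerate case $M = N$ in which the sum on the right side of $(*)$ is empty.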

PROOF. For (a), we write $a_n n^{-s} = a_n n^{-s_0} \cdot n^{-(s - s_0)} = u_n v_n$ and then apply the 
summation by parts formula $(*)$. The given convergence means that the sequence 
$\{U_n\}$ is convergent, and certainly $v_n$ tends to 0 uniformly on any proper half-plane 
of $\operatorname{Re} s > \operatorname{Re} s_0$. Thus the second and third terms on the right side of $(*)$ tend 
to 0 with the required uniformity as $M$ and $N$ tend to $\infty$. For the first term, the 
sequence $\{U_n\}$ is bounded, and we shall show that 

$$\sum_{n=1}^{\infty} |v_n - v_{n+1}| = \sum_{n=1}^{\infty} \bigl|n^{-(s - s_0)} - (n + 1)^{-(s - s_0)}\bigr|$$

is convergent uniformly on compact sets for which $\operatorname{Re} s > \operatorname{Re} s_0$. Use of $(*)$ and 
the Cauchy criterion will complete the proof of convergence. For $n \le t \le n + 1$, 
we have 

$$\bigl|n^{-(s - s_0)} - t^{-(s - s_0)}\bigr| \le \sup_{n \le t \le n+1}\Bigl|\frac{d}{dt}\bigl(n^{-(s - s_0)} - t^{-(s - s_0)}\bigr)\Bigr|
= \sup_{n \le t \le n+1} \frac{|s - s_0|}{|t^{s - s_0 + 1}|} \le \frac{|s - s_0|}{n^{1 + \operatorname{Re}(s - s_0)}}.$$

Thus 

$$|v_n - v_{n+1}| = \bigl|n^{-(s - s_0)} - (n + 1)^{-(s - s_0)}\bigr| \le \frac{|s - s_0|}{n^{1 + \operatorname{Re}(s - s_0)}},$$

and $\sum_{n=1}^{\infty} |v_n - v_{n+1}|$ is uniformly convergent on compact sets with $\operatorname{Re} s > \operatorname{Re} s_0$, 
by the Weierstrass M-test. It follows that the given Dirichlet series is uniformly 
convergent on compact sets for which $\operatorname{Re} s > \operatorname{Re} s_0$. Since each term is analytic 
in this region, the sum is analytic. 

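For (b), we have, for $\operatorname{Re} s \geq \operatorname{Re} s_0$,

$$\Bigl| \frac{a_n}{n^{s}} \Bigr| = \Bigl| \frac{a_n}{n^{s_0}} \Bigr| \cdot \Bigl| \frac{1}{n^{s-s_0}} \Bigr| \leq \Bigl| \frac{a_n}{n^{s_0}} \Bigr|.$$

Since the sum of the right side is convergent, the desired uniform convergence
follows from the Weierstrass M-test.
For (c), let $\epsilon > 0$ be given. Then

$$\Bigl| \frac{a_n}{n^{s_0+1+\epsilon}} \Bigr| = \Bigl| \frac{a_n}{n^{s_0}} \Bigr| \cdot \frac{1}{n^{1+\epsilon}},$$

with the first factor on the right bounded and the second factor contributing to a
finite sum. Therefore we have absolute convergence at $s_0 + 1 + \epsilon$, and (c) follows
from (b).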

For (d), we may assume by (c) that there is absolute convergence at $s_0$. Suppose
that $a_1 = \cdots = a_{N-1} = 0$. By (b), $\sum_{n=N}^{\infty} a_n n^{-s} = 0$ for $\operatorname{Re} s > \operatorname{Re} s_0$. The
series

$$\sum_{n=N}^{\infty} a_n (n/N)^{-s} \tag{$**$}$$

is by assumption absolutely convergent at $s_0$, and $\operatorname{Re} s > \operatorname{Re} s_0$ implies

$$|a_n (n/N)^{-s}| \leq |a_n (n/N)^{-s_0}|.$$

By dominated convergence we can take the limit of $(**)$ term by term as $s \to +\infty$.
The only term that survives is $a_N$. Since $(**)$ has sum 0 for all $s$, we conclude
that $a_N = 0$. This completes the proof.


Proposition 1.24. The Riemann zeta function $\zeta(s) = \sum_{n=1}^{\infty} n^{-s}$, initially
defined and analytic for $\operatorname{Re} s > 1$, extends to be meromorphic for $\operatorname{Re} s > 0$. Its
only pole is at $s = 1$, and the pole is simple.


REMARK. Actually, $\zeta(s)$ extends to be meromorphic in $\mathbb{C}$ with no additional
poles, but we do not need this additional information.


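PROOF. For $\operatorname{Re} s > 1$, we have

$$\frac{1}{s-1} = \int_1^{\infty} t^{-s}\,dt = \sum_{n=1}^{\infty} \int_n^{n+1} t^{-s}\,dt.$$

Thus $\operatorname{Re} s > 1$ implies

$$\zeta(s) = \frac{1}{s-1} + \sum_{n=1}^{\infty} \Bigl( n^{-s} - \int_n^{n+1} t^{-s}\,dt \Bigr) = \frac{1}{s-1} + \sum_{n=1}^{\infty} \int_n^{n+1} (n^{-s} - t^{-s})\,dt.$$

It is enough to show that the series on the right side converges uniformly on
compact sets for $\operatorname{Re} s > 0$. Thus suppose that $\operatorname{Re} s \geq \sigma > 0$ and $|s| \leq C$. The
proof of Proposition 1.23a showed that $|n^{-s} - t^{-s}| \leq |s| n^{-(1+\operatorname{Re} s)}$ for $n \leq t \leq n+1$. Hence

$$\Bigl| \int_n^{n+1} (n^{-s} - t^{-s})\,dt \Bigr| \leq \int_n^{n+1} |n^{-s} - t^{-s}|\,dt \leq |s| n^{-(1+\operatorname{Re} s)} \leq C n^{-(1+\sigma)}.$$

Since $\sum_{n=1}^{\infty} n^{-(1+\sigma)} < \infty$, the desired uniform convergence follows from the
Weierstrass M-test.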
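The formula in the proof can be evaluated directly, since each integral has the closed form $\int_n^{n+1} t^{-s}\,dt = ((n+1)^{1-s} - n^{1-s})/(1-s)$. The following Python sketch is an illustration only; the function name `zeta_via_integrals` and the truncation parameter `terms` are assumptions made for the example, and the truncated sum is only an approximation whose accuracy degrades as $\operatorname{Re} s$ approaches 0.

```python
import math

def zeta_via_integrals(s, terms=2000):
    """Approximate zeta(s) for Re s > 0, s != 1, by truncating
    1/(s-1) + sum_{n>=1} int_n^{n+1} (n^{-s} - t^{-s}) dt after `terms` terms."""
    s = complex(s)
    total = 1.0 / (s - 1.0)
    for n in range(1, terms + 1):
        # Closed form of int_n^{n+1} t^{-s} dt.
        integral = ((n + 1) ** (1 - s) - n ** (1 - s)) / (1 - s)
        total += n ** (-s) - integral
    return total

# At s = 2 the series is the classical one, with sum pi^2 / 6.
print(abs(zeta_via_integrals(2) - math.pi ** 2 / 6) < 1e-6)
# The same formula still makes sense at s = 1/2, where the defining series diverges.
print(zeta_via_integrals(0.5, terms=20000).real)
```

The $n$th term is of size about $|s| n^{-(1+\operatorname{Re} s)}$, in line with the estimate in the proof, so the truncation error at $s = 1/2$ decays only like the inverse square root of `terms`.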


Proposition 1.25. Let $Z(s) = \sum_{n=1}^{\infty} a_n n^{-s}$ be a Dirichlet series with all
$a_n \geq 0$. Suppose that the series is convergent in some half-plane and that the sum
extends to be analytic for $\operatorname{Re} s > 0$. Then the series converges for $\operatorname{Re} s > 0$.


PROOF. By assumption the series converges somewhere, and therefore $s_0 =
\inf \{ s \geq 0 \mid \sum_{n=1}^{\infty} a_n n^{-s} \text{ converges} \}$ is a well-defined real number $\geq 0$. Arguing
by contradiction, suppose that $s_0 > 0$. Since $\sum a_n n^{-s}$ converges uniformly on
compact sets for $\operatorname{Re} s > s_0$ by Proposition 1.23a and since the terms of the series
are analytic, we can compute the derivatives of the series term by term. Thus

$$Z^{(N)}(s_0 + 1) = \sum_{n=1}^{\infty} \frac{a_n (-\log n)^N}{n^{s_0+1}}. \tag{$*$}$$

The Taylor series of $Z(s)$ about $s_0 + 1$ is

$$Z(s) = \sum_{N=0}^{\infty} \frac{1}{N!} (s - s_0 - 1)^N Z^{(N)}(s_0 + 1)$$

and is convergent at $s = \tfrac12 s_0$ since $Z(s)$ is analytic in the open disk centered at
$s_0 + 1$ and having radius $s_0 + 1$. Thus

$$Z(\tfrac12 s_0) = \sum_{N=0}^{\infty} \frac{1}{N!} (1 + \tfrac12 s_0)^N (-1)^N Z^{(N)}(s_0 + 1),$$

with the series convergent. Substituting from $(*)$, we have

$$Z(\tfrac12 s_0) = \sum_{N=0}^{\infty} \sum_{n=1}^{\infty} \frac{a_n (\log n)^N (1 + \tfrac12 s_0)^N}{N!\, n^{s_0+1}}.$$

This is a series with terms $\geq 0$, and Fubini's Theorem allows us to interchange
the order of summation and obtain

$$Z(\tfrac12 s_0) = \sum_{n=1}^{\infty} \frac{a_n}{n^{s_0+1}} \sum_{N=0}^{\infty} \frac{\bigl((\log n)(1 + \tfrac12 s_0)\bigr)^N}{N!} = \sum_{n=1}^{\infty} \frac{a_n}{n^{s_0+1}}\, e^{(\log n)(1 + \frac12 s_0)} = \sum_{n=1}^{\infty} a_n n^{-\frac12 s_0}.$$

In other words, the assumption $s_0 > 0$ led to a point between 0 and $s_0$ (namely $\tfrac12 s_0$)
for which there is convergence. This contradiction proves that $s_0 = 0$. Therefore
$\sum_{n=1}^{\infty} a_n n^{-s}$ converges for $\operatorname{Re} s > 0$.


We shall now examine special features of Dirichlet series that allow the
series to have product expansions like the one for $\zeta(s)$, namely $\sum_{n=1}^{\infty} n^{-s} =
\prod_{p \text{ prime}} \frac{1}{1 - p^{-s}}$. Consider a formal product

$$\prod_{p \text{ prime}} (1 + a_p p^{-s} + \cdots + a_{p^m} p^{-ms} + \cdots).$$

If this product is expanded without regard to convergence, the result is the Dirichlet
series $\sum_{n=1}^{\infty} a_n n^{-s}$, where $a_1 = 1$ and $a_n$ is given by

$$a_n = a_{p_1^{r_1}} \cdots a_{p_k^{r_k}} \quad \text{if } n = p_1^{r_1} \cdots p_k^{r_k}.$$

Suppose that the Dirichlet series $\sum_{n=1}^{\infty} a_n n^{-s}$ is in fact absolutely convergent in
some right half-plane. Then every rearrangement is absolutely convergent to the
same sum, and the same conclusion is valid for subseries. If $E$ is a finite set of
primes and if $N(E)$ denotes the set of positive integers requiring only members
of $E$ for their factorization, then we have

$$\prod_{p \in E} (1 + a_p p^{-s} + \cdots + a_{p^m} p^{-ms} + \cdots) = \sum_{n \in N(E)} a_n n^{-s}.$$


Letting $E$ swell to the whole set of primes, we see that the infinite
product has a limit in the half-plane of absolute convergence of the Dirichlet
series, and the limit of the infinite product equals the sum of the series. The sum
of the series is 0 only if one of the factors on the left side is 0. In particular, the
sum of the series cannot be identically 0, by Proposition 1.23d. Thus the limit of
the infinite product can be given by only this one Dirichlet series.
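The identity for a finite set $E$ of primes can be checked exactly once the exponents are capped, because the capped product and the capped sum are then both finite and agree by distributivity. The following Python sketch is an illustration only: it assumes the multiplicative coefficients $a_n = \varphi(n)$, the set $E = \{2, 3, 5\}$, and $s = 4$, and the function names and the cap `max_exp` are hypothetical.

```python
from fractions import Fraction
from itertools import product
from math import gcd

def euler_phi(n):
    """Euler's phi function by direct count (adequate for small n)."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def truncated_product(primes, s, max_exp):
    """Product over p in primes of sum_{m=0}^{max_exp} phi(p^m) p^{-ms}."""
    result = Fraction(1)
    for p in primes:
        result *= sum(Fraction(euler_phi(p ** m), p ** (m * s))
                      for m in range(max_exp + 1))
    return result

def truncated_series(primes, s, max_exp):
    """Sum of phi(n) n^{-s} over n = prod p^{e_p} with 0 <= e_p <= max_exp."""
    total = Fraction(0)
    for exponents in product(range(max_exp + 1), repeat=len(primes)):
        n = 1
        for p, e in zip(primes, exponents):
            n *= p ** e
        total += Fraction(euler_phi(n), n ** s)
    return total

E = [2, 3, 5]
print(truncated_product(E, 4, 3) == truncated_series(E, 4, 3))
```

Exact rational arithmetic is used so that the comparison is an equality rather than an approximate agreement; letting `max_exp` grow recovers the identity displayed above for this $E$.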

Conversely if an absolutely convergent Dirichlet series $\sum_{n=1}^{\infty} a_n n^{-s}$ has the
property that its coefficients are multiplicative, i.e.,

$$a_1 = 1 \quad \text{and} \quad a_{mn} = a_m a_n \text{ whenever } \operatorname{GCD}(m,n) = 1,$$

then we can form the above infinite product and recover the given series by
expanding the product and using the formula $a_n = a_{p_1^{r_1}} \cdots a_{p_k^{r_k}}$ when $n = p_1^{r_1} \cdots p_k^{r_k}$.
In this case we say that the Dirichlet series $\sum_{n=1}^{\infty} a_n n^{-s}$ has the infinite product
as an Euler product. Many functions in elementary number theory give rise
to multiplicative sequences; an example is $a_n = \varphi(n)$, where $\varphi$ is the Euler $\varphi$
function.

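If the coefficients are strictly multiplicative, i.e., if

$$a_1 = 1 \quad \text{and} \quad a_{mn} = a_m a_n \text{ for all } m \text{ and } n,$$

then the $p^{\text{th}}$ factor of the infinite product simplifies to

$$1 + a_p p^{-s} + \cdots + (a_p p^{-s})^m + \cdots = \frac{1}{1 - a_p p^{-s}}.$$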


As a consequence we obtain the following proposition. 


Proposition 1.26. If the coefficients of the Dirichlet series $\sum_{n=1}^{\infty} a_n n^{-s}$ are
strictly multiplicative, then the Dirichlet series has an Euler product of the form

$$\sum_{n=1}^{\infty} a_n n^{-s} = \prod_{p \text{ prime}} \frac{1}{1 - a_p p^{-s}},$$

valid in its region of absolute convergence.


REMARK. We refer to the kind of Euler product in this proposition as a first- 
degree Euler product. 


This is what happens with $\zeta(s)$, for which all the coefficients are 1, and with
$a_n = \chi_0(n)$ and $a_n = \chi_1(n)$ as in the previous section. Conversely an Euler
product expansion of the form in the proposition forces the coefficients of the
Dirichlet series to be strictly multiplicative.
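A first-degree Euler product can be compared numerically with its Dirichlet series. The following Python sketch is an illustration only; it assumes that $\chi_1$ is the character that is 1 at integers congruent to 1 modulo 4, is $-1$ at integers congruent to 3 modulo 4, and is 0 at even integers, and the function names and truncation limits are hypothetical.

```python
def chi1(n):
    """Assumed chi_1: 1 at 1 mod 4, -1 at 3 mod 4, 0 at even integers."""
    if n % 2 == 0:
        return 0
    return 1 if n % 4 == 1 else -1

def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def dirichlet_series(s, terms):
    """Partial sum of sum_n chi1(n) n^{-s}."""
    return sum(chi1(n) / n ** s for n in range(1, terms + 1))

def euler_product(s, limit):
    """Partial product of prod_p 1 / (1 - chi1(p) p^{-s}) over primes p <= limit."""
    result = 1.0
    for p in primes_up_to(limit):
        result /= 1.0 - chi1(p) / p ** s
    return result

print(dirichlet_series(2, 100000), euler_product(2, 10000))
```

At $s = 2$, inside the region of absolute convergence, the two truncations agree to several decimal places; the common limit is the number known as Catalan's constant.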

A Dirichlet series $\sum_{n=1}^{\infty} a_n n^{-s}$ with $|a_n| \leq n^c$ for some real $c$ is absolutely
convergent for $\operatorname{Re} s > c + 1$. This fact leads us to a convergence criterion for
first-degree Euler products.


Proposition 1.27. A first-degree Euler product $\prod_{p} (1 - a_p p^{-s})^{-1}$ with
$|a_p| \leq p^c$ for some real $c$ and all primes $p$ defines an absolutely convergent
Dirichlet series for $\operatorname{Re} s > c + 1$ and hence a valid identity $\sum_{n=1}^{\infty} a_n n^{-s} =
\prod_{p \text{ prime}} (1 - a_p p^{-s})^{-1}$ in that region.


PROOF. The coefficients $a_n$ are strictly multiplicative, and thus $|a_n| \leq n^c$ for
all $n$. The absolute convergence follows.


10. Dirichlet’s Theorem on Primes in Arithmetic Progressions 


In this section we shall prove Dirichlet’s Theorem as stated in Theorem 1.21. 
Recall from Section 8 that the proof of Dirichlet’s Theorem for the progressions 
4n + 1 and 4n + 3 required taking the sum and difference of two expressions, 
working with them, and then passing back to the original expressions. Generaliz- 
ing this step involves recognizing this process as Fourier analysis on the 2-element 
group $(\mathbb{Z}/4\mathbb{Z})^\times$. This kind of Fourier analysis was discussed in Section VII.4
of Basic Algebra. Let us begin by reviewing what is needed from that section 
of Basic Algebra and then pinpoint the Fourier analysis that was the key to the 
argument in Section 8. 

Let $G$ be a finite abelian group, such as $(\mathbb{Z}/m\mathbb{Z})^\times$. A multiplicative character
of $G$ is a homomorphism of $G$ into the circle group $S^1 \subseteq \mathbb{C}^\times$. The multiplicative
characters of $G$ form a finite abelian group $\widehat{G}$ under pointwise multiplication:

$$(\chi\chi')(g) = \chi(g)\chi'(g).$$


In this setting we recall the statement of the Fourier inversion formula. 


THEOREM 7.17 OF Basic Algebra (Fourier inversion formula). Let $G$ be a
finite abelian group, and introduce an inner product on the complex vector space
$C(G, \mathbb{C})$ of all functions from $G$ to $\mathbb{C}$ by the formula

$$(F, F') = \sum_{g \in G} F(g)\overline{F'(g)},$$

the corresponding norm being $\|F\| = (F, F)^{1/2}$. Then the members of $\widehat{G}$ form an
orthogonal basis of $C(G, \mathbb{C})$, each $\chi$ in $\widehat{G}$ satisfying $\|\chi\|^2 = |G|$. Consequently
$|\widehat{G}| = |G|$, and any function $F : G \to \mathbb{C}$ is given by the “sum of its Fourier
series”:

$$F(g) = \frac{1}{|G|} \sum_{\chi \in \widehat{G}} \Bigl( \sum_{h \in G} F(h)\overline{\chi(h)} \Bigr) \chi(g).$$


EXAMPLE. With the two-element group $G = \{\pm 1\}$, there are two multiplicative
characters, with $\chi_0(+1) = \chi_0(-1) = 1$, $\chi_1(+1) = 1$, and $\chi_1(-1) = -1$. We
can think of the Fourier-coefficient mapping as carrying any complex-valued
function $F$ on $G$ to the function $\widehat{F}$ on $\widehat{G}$ given by $\widehat{F}(\chi) = \sum_{h \in G} F(h)\overline{\chi(h)}$.
The inversion formula says that $F$ is recovered as $F = \tfrac12 (\widehat{F}(\chi_0)\chi_0 + \widehat{F}(\chi_1)\chi_1)$.
A basis for the 2-dimensional space of complex-valued functions on $G$ consists
of the two functions $F^+$ and $F^-$, with $F^+$ equal to 1 at $+1$ and 0 at $-1$ and
with $F^-$ equal to 0 at $+1$ and 1 at $-1$. The multiplicative characters are given
by $\chi_0 = F^+ + F^-$ and $\chi_1 = F^+ - F^-$. For these two functions the inversion
formula reads $F^+ = \tfrac12 (\chi_0 + \chi_1)$ and $F^- = \tfrac12 (\chi_0 - \chi_1)$. In Section 8 the roles of
$F^+$ and $F^-$ are played by functions of $s$, not by scalars, with $F^+$ corresponding
to $\sum_{p \equiv 1 \bmod 4} p^{-s}$ and $F^-$ corresponding to $\sum_{p \equiv 3 \bmod 4} p^{-s}$. We are to consider
the functions of $s$ corresponding to their sum $\chi_0$ and to their difference $\chi_1$. The
results of Section 9 show that these are the series that come from Euler products.
The role of the Fourier inversion formula is to ensure that we can reconstruct
$\sum_{p \equiv 1 \bmod 4} p^{-s}$ and $\sum_{p \equiv 3 \bmod 4} p^{-s}$ from the sum and difference. The general
proof of Dirichlet's Theorem is a direct generalization of this argument for $m = 4$.
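The inversion formula can be tried out on a group slightly larger than $\{\pm 1\}$. The following Python sketch is an illustration only; it assumes the cyclic group $\mathbb{Z}/n\mathbb{Z}$, written additively, whose multiplicative characters are $g \mapsto e^{2\pi i k g/n}$ for $0 \leq k < n$, and the function names and the sample function `F` are hypothetical.

```python
import cmath

def characters(n):
    """The n multiplicative characters of the cyclic group Z/nZ."""
    def make(k):
        return lambda g: cmath.exp(2j * cmath.pi * k * g / n)
    return [make(k) for k in range(n)]

def fourier_coefficient(F, chi):
    """Inner product (F, chi) = sum over h of F(h) * conj(chi(h))."""
    return sum(F[h] * chi(h).conjugate() for h in range(len(F)))

def fourier_inversion(F):
    """Rebuild F from its Fourier coefficients via the inversion formula."""
    n = len(F)
    chars = characters(n)
    return [sum(fourier_coefficient(F, chi) * chi(g) for chi in chars) / n
            for g in range(n)]

F = [3, -1 + 2j, 0.5, 4, 0, -2j]   # an arbitrary function on Z/6Z
recovered = fourier_inversion(F)
print(all(abs(a - b) < 1e-9 for a, b in zip(F, recovered)))
```

The same code with $n = 2$ reproduces the formulas of the example, with `F = [1, 0]` playing the role of $F^+$ and `F = [0, 1]` the role of $F^-$.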


Fix an integer $m > 1$. A Dirichlet character modulo $m$ is a function
$\chi : \mathbb{Z} \to S^1 \cup \{0\}$ such that

(i) $\chi(j) = 0$ if and only if $\operatorname{GCD}(j, m) > 1$,
(ii) $\chi(j)$ depends only on the residue class $j \bmod m$,
(iii) when regarded as a function on the residue classes modulo $m$, $\chi$ is a
multiplicative character of $(\mathbb{Z}/m\mathbb{Z})^\times$.


In particular, a Dirichlet character modulo $m$ determines a multiplicative character
of $(\mathbb{Z}/m\mathbb{Z})^\times$. Conversely each multiplicative character of $(\mathbb{Z}/m\mathbb{Z})^\times$ defines a
unique Dirichlet character modulo $m$ as the lift of the multiplicative character on
the set $\{ j \in \mathbb{Z} \mid \operatorname{GCD}(j, m) = 1 \}$ and as 0 on the rest of $\mathbb{Z}$. For example the
multiplicative character on $(\mathbb{Z}/4\mathbb{Z})^\times$ that is 1 at $1 \bmod 4$ and is $-1$ at $3 \bmod 4$
lifts to the Dirichlet character that is 1 at integers congruent to 1 modulo 4,
is $-1$ at integers congruent to 3 modulo 4, and is 0 at even integers. It will
often be notationally helpful to use the same symbol for the Dirichlet character
and the multiplicative character of $(\mathbb{Z}/m\mathbb{Z})^\times$. Because of this correspondence,
the number of Dirichlet characters modulo $m$ matches the order of $\widehat{G}$ for $G =
(\mathbb{Z}/m\mathbb{Z})^\times$, which matches the order of $G$ and is $\varphi(m)$, where $\varphi$ is the Euler $\varphi$
function. The principal Dirichlet character modulo $m$, denoted by $\chi_0$, is the one
built from the trivial character of $(\mathbb{Z}/m\mathbb{Z})^\times$:

$$\chi_0(j) = \begin{cases} 1 & \text{if } \operatorname{GCD}(j, m) = 1, \\ 0 & \text{if } \operatorname{GCD}(j, m) > 1. \end{cases}$$

Each Dirichlet character modulo $m$ is strictly multiplicative, in the sense of
the previous section. We assemble each as the coefficients of a Dirichlet series,
the associated Dirichlet L function, by the definition

$$L(s, \chi) = \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}.$$

Proposition 1.28. Fix $m$, and let $\chi$ be a Dirichlet character modulo $m$.


(a) The Dirichlet series $L(s, \chi)$ is absolutely convergent for $\operatorname{Re} s > 1$ and is
given in that region by a first-degree Euler product

$$L(s, \chi) = \prod_{p \text{ prime}} \frac{1}{1 - \chi(p) p^{-s}}.$$

(b) If $\chi$ is not principal, then the series for $L(s, \chi)$ is convergent for $\operatorname{Re} s > 0$,
and the sum is analytic for $\operatorname{Re} s > 0$.

(c) For the principal Dirichlet character $\chi_0$ modulo $m$, $L(s, \chi_0)$ extends to be
meromorphic for $\operatorname{Re} s > 0$. Its only pole for $\operatorname{Re} s > 0$ is at $s = 1$, and the pole is
simple. It is given in terms of the Riemann zeta function by

$$L(s, \chi_0) = \zeta(s) \prod_{\substack{p \text{ prime}, \\ p \text{ dividing } m}} (1 - p^{-s}).$$

PROOF. For (a), the boundedness of $\chi$ implies that the series is absolutely
convergent for $\operatorname{Re} s > 1$. Since $\chi$ is strictly multiplicative, $L(s, \chi)$ has a first-degree
Euler product by Proposition 1.26, and the product is convergent in the
same region.

For (b), let us notice that $\chi \neq \chi_0$ implies the equality

$$\sum_{n=1}^{m} \chi(n + b) = 0 \quad \text{for any } b, \tag{$*$}$$

since the member of $\widehat{(\mathbb{Z}/m\mathbb{Z})^\times}$ that corresponds to $\chi$ is orthogonal to the trivial
character, by the Fourier inversion formula as quoted above from Basic Algebra.
For $s$ real and positive, let us write

$$\frac{\chi(n)}{n^s} = \chi(n) \cdot \frac{1}{n^s} = u_n v_n$$

in the notation of the summation by parts formula that follows the statement of
Proposition 1.23, and let us put $U_n = \sum_{k=1}^{n} u_k$. Equation $(*)$ implies that $\{U_n\}$
is bounded, say with $|U_n| \leq C$. Summation by parts then gives

$$\Bigl| \sum_{n=M}^{N} \frac{\chi(n)}{n^s} \Bigr| \leq \sum_{n=M}^{N-1} C \Bigl( \frac{1}{n^s} - \frac{1}{(n+1)^s} \Bigr) + \frac{C}{N^s} + \frac{C}{M^s} = \frac{2C}{M^s}.$$

This expression tends to 0 as $M$ and $N$ tend to $\infty$. Therefore the series $L(s, \chi) =
\sum_{n=1}^{\infty} \chi(n) n^{-s}$ is convergent for $s$ real and positive. By Proposition 1.23a the series
is convergent for $\operatorname{Re} s > 0$, and the sum is analytic in this region.

For (c), let $\operatorname{Re} s > 1$. From the product formula in (a) with $\chi$ set equal to $\chi_0$,
we have

$$L(s, \chi_0) = \prod_{\substack{p \text{ prime}, \\ p \text{ not dividing } m}} \frac{1}{1 - p^{-s}}.$$

Using the Euler product expansion of $\zeta(s)$, we obtain the displayed formula of (c).
The remaining statements in (c) follow from Proposition 1.24, since the product
over primes $p$ dividing $m$ is a finite product.
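Both the Cauchy estimate $2C/M^s$ from (b) and the formula in (c) are easy to observe numerically. The following Python sketch is an illustration only; it assumes the nonprincipal character modulo 4 for (b), where the partial sums $U_n$ take only the values 0 and 1 so that $C = 1$, and the modulus $m = 12$ for (c). The function names and truncation limits are hypothetical.

```python
import math

def chi(n):
    """The nonprincipal Dirichlet character modulo 4."""
    return (0, 1, 0, -1)[n % 4]

def block_sum(s, M, N):
    """The block sum of chi(n) / n^s over M <= n <= N."""
    return sum(chi(n) / n ** s for n in range(M, N + 1))

def principal_L(s, m, terms):
    """Partial sum of L(s, chi_0) for the principal character modulo m."""
    return sum(1 / n ** s for n in range(1, terms + 1) if math.gcd(n, m) == 1)

def zeta_times_factors(s, m, terms):
    """Partial sum of zeta(s) times the product of (1 - p^{-s}) over primes p dividing m."""
    value = sum(1 / n ** s for n in range(1, terms + 1))
    for p in range(2, m + 1):
        is_prime = all(p % d for d in range(2, p))
        if is_prime and m % p == 0:
            value *= 1 - p ** (-s)
    return value

# (b): at s = 1/2 the block from M = 100 to N = 5000 is bounded by 2C / M^s with C = 1.
print(abs(block_sum(0.5, 100, 5000)) <= 2 / 100 ** 0.5)
# (c): the two sides agree at s = 2 for m = 12, up to truncation error.
print(principal_L(2, 12, 100000), zeta_times_factors(2, 12, 100000))
```

For $s = 1$ the partial sums `block_sum(1, 1, N)` are those of the Leibniz series and approach $\pi/4$, which is a nonzero value of $L(1, \chi)$ for this particular $\chi$.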


By Proposition 1.28b, $L(s, \chi)$ is well defined and finite at $s = 1$ if $\chi$ is not
principal. The main step in the proof of Dirichlet's Theorem is the following
lemma.


Lemma 1.29. $L(1, \chi) \neq 0$ if $\chi$ is not principal.


PROOF. Let $Z(s) = \prod_{\chi} L(s, \chi)$. Exactly one factor of $Z(s)$ has a pole at
$s = 1$, according to Proposition 1.28. If any factor has a zero at $s = 1$, then $Z(s)$
is analytic for $\operatorname{Re} s > 0$. Assuming that $Z(s)$ is indeed analytic, we shall derive
a contradiction.

Being the finite product of absolutely convergent Dirichlet series for $\operatorname{Re} s > 1$,
$Z(s)$ is given by an absolutely convergent Dirichlet series. We shall prove that
the coefficients of this series are $\geq 0$. More precisely we shall prove for $\operatorname{Re} s > 1$
that

$$Z(s) = \prod_{p \text{ with } \operatorname{GCD}(p,m)=1} \frac{1}{(1 - p^{-f(p)s})^{g(p)}}, \tag{$*$}$$

where $f(p)$ is the order of $p$ in $(\mathbb{Z}/m\mathbb{Z})^\times$ and where $g(p) = \varphi(m)/f(p)$, $\varphi$ being
Euler's $\varphi$ function. The factor $(1 - p^{-f(p)s})^{-1}$ is given by a Dirichlet series with
all coefficients $\geq 0$. Hence so is the $g(p)^{\text{th}}$ power, and so is the product over $p$
of the result. Thus $(*)$ will prove that all coefficients of $Z(s)$ are $\geq 0$.

To prove $(*)$, we write, for $\operatorname{Re} s > 1$,

$$Z(s) = \prod_{\chi} \prod_{p \text{ prime}} \frac{1}{1 - \chi(p) p^{-s}} = \prod_{p \text{ with } \operatorname{GCD}(p,m)=1} \Bigl( \prod_{\chi} \frac{1}{1 - \chi(p) p^{-s}} \Bigr).$$

Fix $p$ not dividing $m$. We shall show that

$$\prod_{\chi} (1 - \chi(p) p^{-s}) = (1 - p^{-fs})^{g}, \tag{$**$}$$

where $f$ is the order of $p$ in $(\mathbb{Z}/m\mathbb{Z})^\times$ and where $g = \varphi(m)/f$; then $(*)$ will
follow.

The function $\chi \mapsto \chi(p)$ is a homomorphism of $\widehat{(\mathbb{Z}/m\mathbb{Z})^\times}$ into the subgroup
$\{e^{2\pi i k/f}\}$ of $S^1$ and is onto some cyclic subgroup $\{e^{2\pi i k/f'}\}$ with $f'$ dividing
$f$. Let us see that $f' = f$. In fact, if $f' < f$, then $p^{f'} \not\equiv 1 \bmod m$, while
$\chi(p^{f'}) = \chi(p)^{f'} = 1$ for all $\chi$; since $\chi(p^{f'}) = \chi(1)$ for all $\chi$, the $\chi$'s cannot
span all functions on $(\mathbb{Z}/m\mathbb{Z})^\times$, in contradiction to the Fourier inversion formula
(Theorem 7.17 of Basic Algebra).

Thus $\chi \mapsto \chi(p)$ is onto $\{e^{2\pi i k/f}\}$. In other words, $\chi(p)$ takes on all $f^{\text{th}}$ roots
of unity as values, and the homomorphism property ensures that each is taken on
the same number of times, namely $g = \varphi(m)/f$ times. If $X$ is an indeterminate,
we then have

$$\prod_{\chi} (1 - \chi(p) X) = \Bigl( \prod_{k=0}^{f-1} (1 - e^{2\pi i k/f} X) \Bigr)^{g} = (1 - X^f)^g.$$

Then $(**)$ follows and so does $(*)$. Hence all the coefficients of the Dirichlet
series of $Z(s)$ are $\geq 0$. We have already observed that this series, as the finite
product of absolutely convergent series for $\operatorname{Re} s > 1$, is absolutely convergent for
$\operatorname{Re} s > 1$. Thus Proposition 1.25 applies and shows that the Dirichlet series of
$Z(s)$ converges for $\operatorname{Re} s > 0$.

Since the coefficients of the series are positive, the convergence is absolute
for $s$ real and positive. By Proposition 1.23b the convergence is absolute for
$\operatorname{Re} s > 0$. Therefore the Euler product expansion $(*)$ is valid for $\operatorname{Re} s > 0$.

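For primes $p$ not dividing $m$ and for real $s > 0$, we have

$$\frac{1}{(1 - p^{-fs})^{g}} = (1 + p^{-fs} + p^{-2fs} + \cdots)^{g} \geq 1 + p^{-fgs} + p^{-2fgs} + \cdots = 1 + p^{-\varphi(m)s} + p^{-2\varphi(m)s} + \cdots = \frac{1}{1 - p^{-\varphi(m)s}}.$$

In combination with $(*)$, this inequality gives

$$Z(s) \Bigl( \prod_{p \text{ dividing } m} \frac{1}{1 - p^{-\varphi(m)s}} \Bigr) \geq \Bigl( \prod_{p \text{ with } \operatorname{GCD}(p,m)=1} \frac{1}{1 - p^{-\varphi(m)s}} \Bigr) \Bigl( \prod_{p \text{ dividing } m} \frac{1}{1 - p^{-\varphi(m)s}} \Bigr) = \prod_{p \text{ prime}} \frac{1}{1 - p^{-\varphi(m)s}} = \sum_{n=1}^{\infty} \frac{1}{n^{\varphi(m)s}}.$$

The sum on the right is $+\infty$ for $s = 1/\varphi(m)$, while the left side is finite for that
$s$. This contradiction completes the proof of the lemma.


PROOF OF THEOREM 1.21. First we show for each Dirichlet character $\chi$ modulo
$m$ that

$$\log L(s, \chi) = \sum_{p \text{ prime}} \frac{\chi(p)}{p^s} + g(s, \chi) \tag{$*$}$$

for real numbers $s > 1$, with $g(s, \chi)$ remaining bounded as $s \downarrow 1$. In this
statement we have not yet specified a branch of the logarithm, and we shall
choose it presently. Fix $p$ and define, for $s > 1$, a value of the logarithm of the
$p^{\text{th}}$ factor of the Euler product of $L(s, \chi)$ in Proposition 1.28a by

$$\log \Bigl( \frac{1}{1 - \chi(p) p^{-s}} \Bigr) = \frac{\chi(p)}{p^s} + \frac{1}{2} \frac{\chi(p^2)}{p^{2s}} + \frac{1}{3} \frac{\chi(p^3)}{p^{3s}} + \cdots = \frac{\chi(p)}{p^s} + g(s, p, \chi). \tag{$\dagger$}$$

In Section 8 we obtained the inequality $\bigl| \log \frac{1}{1-x} - x \bigr| \leq 2|x|^2$ for real $x$
with $|x| \leq \tfrac12$, but the proof remains valid for complex $x$ with $|x| \leq \tfrac12$. Since
$x = \chi(p) p^{-s}$ is complex with $|\chi(p) p^{-s}| \leq \tfrac12$, we obtain

$$|g(s, p, \chi)| = \Bigl| \log \Bigl( \frac{1}{1 - \chi(p) p^{-s}} \Bigr) - \chi(p) p^{-s} \Bigr| \leq 2 |\chi(p) p^{-s}|^2 \leq 2 p^{-2}.$$

Since $\sum_{p \text{ prime}} p^{-2} \leq \sum_{n=1}^{\infty} n^{-2} < \infty$, the series $\sum_{p} g(s, p, \chi)$ is uniformly
convergent for $s > 1$. Let $g(s, \chi)$ be the continuous function $\sum_{p} g(s, p, \chi)$.
Summing $(\dagger)$ over primes $p$, we obtain

$$\sum_{p} \log \Bigl( \frac{1}{1 - \chi(p) p^{-s}} \Bigr) = \sum_{p} \frac{\chi(p)}{p^s} + g(s, \chi).$$

Because of the validity of the Euler product expansion of $L(s, \chi)$ in Proposition
1.28a, the left side represents a branch of $\log L(s, \chi)$. This proves $(*)$.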
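For each $b$ prime to $m$, define a function $F_b$ on the positive integers by

$$F_b(n) = \begin{cases} 1 & \text{if } n \equiv b \bmod m, \\ 0 & \text{otherwise.} \end{cases}$$

The Fourier inversion formula (Theorem 7.17 of Basic Algebra) gives

$$\sum_{\chi} \overline{\chi(b)} \chi(n) = \varphi(m) F_b(n). \tag{$\dagger\dagger$}$$

Multiplying $(*)$ by $\overline{\chi(b)}$, summing on $\chi$, and using $(\dagger\dagger)$ to handle the term that is
summed over $p$ prime, we obtain

$$\varphi(m) \sum_{\substack{p \text{ prime}, \\ p = km+b}} p^{-s} = \sum_{\chi} \overline{\chi(b)} \log L(s, \chi) - \sum_{\chi} \overline{\chi(b)} g(s, \chi). \tag{$\dagger\dagger\dagger$}$$

The term $\sum_{\chi} \overline{\chi(b)} g(s, \chi)$ is bounded as $s \downarrow 1$, according to $(*)$. The term
$\overline{\chi_0(b)} \log L(s, \chi_0)$ is unbounded as $s \downarrow 1$, by Proposition 1.28c. For $\chi$ nonprincipal,
the term $\overline{\chi(b)} \log L(s, \chi)$ is bounded as $s \downarrow 1$, by Proposition 1.28b and
Lemma 1.29. Therefore the left side of $(\dagger\dagger\dagger)$ is unbounded as $s \downarrow 1$. Hence the
number of primes contributing to the sum is infinite.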
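The conclusion can be watched empirically, although no finite computation proves it. The following Python sketch is an illustration only: it accumulates $\sum 1/p$ over the primes $p \leq x$ in each residue class prime to $m$, which corresponds to taking $s = 1$ in the sum on the left side of the last display. The function names, the modulus 10, and the cutoffs are assumptions made for the example.

```python
from math import gcd

def primes_up_to(limit):
    """Sieve of Eratosthenes."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for i in range(2, int(limit ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [i for i, is_prime in enumerate(sieve) if is_prime]

def reciprocal_sums(m, limit):
    """Sum of 1/p over primes p <= limit, split by residue class b prime to m."""
    sums = {b: 0.0 for b in range(1, m) if gcd(b, m) == 1}
    for p in primes_up_to(limit):
        if p % m in sums:
            sums[p % m] += 1.0 / p
    return sums

for limit in (10 ** 3, 10 ** 5):
    print(limit, reciprocal_sums(10, limit))
```

Each of the four classes modulo 10 keeps receiving contributions as the cutoff grows, and the four partial sums grow at comparable, very slow rates.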
11. Problems


1. Fix an odd integer $m > 1$. Let $P$ be the set of odd primes $p > 0$ such that
$x^2 \equiv m \bmod p$ is solvable and such that $p$ does not divide $m$. Show that $P$ is
nonempty and that there is a finite set $S$ of arithmetic progressions such that the
members of $P$ are the odd primes $> 0$ that lie in at least one member of $S$.


2. Let $D$ be a nonsquare integer, and let $m$ be an odd integer with $\operatorname{GCD}(D, m) = 1$.
By suitably adapting the proof of Theorem 1.6,

(a) prove that if $m$ is primitively representable by some binary quadratic form
of discriminant $D$, then $x^2 \equiv D \bmod m$ is solvable,

(b) prove that if $x^2 \equiv D \bmod m$ is solvable and $m$ is odd, then $m$ is primitively
representable by some binary quadratic form of discriminant $D$.


3. For a fixed discriminant $D$, let $H$ be the group of proper equivalence classes
of binary quadratic forms of discriminant $D$, and let $H'$ be the set of ordinary
equivalence classes of discriminant $D$. Inclusion of a proper equivalence class
into the ordinary equivalence class that contains it gives a map $f$ of $H$ onto $H'$.
Give an example in which $H'$ can admit no group structure for which $f$ is a group
homomorphism.


4. (a) Show that if $(a, b, c)$ has order 3 in the form class group, then the product
of any two integers of the form $ax^2 + bxy + cy^2$ is again of that form.

(b) Show that $h(-23) = 3$.

(c) Using the general theory, show that the class of $2x^2 + xy + 3y^2$ has order 3.

(d) Find an explicit formula for $(X, Y)$ in terms of $(x_1, y_1)$ and $(x_2, y_2)$ such
that $(2x_1^2 + x_1 y_1 + 3y_1^2)(2x_2^2 + x_2 y_2 + 3y_2^2) = 2X^2 + XY + 3Y^2$.


5. If two integer forms are improperly equivalent over $\mathbb{Z}$, prove that they are properly
equivalent over $\mathbb{Q}$.


6. Verify for the fundamental discriminant $D = -67$ that $h(D) = 1$. (Educational
note: It is known that the only negative fundamental discriminants
$D$ with $h(D) = 1$ are $-3, -4, -7, -8, -11, -19, -43, -67, -163$. It is
known also that the only other nonsquare $D < 0$ for which $h(D) = 1$ are
$-12, -16, -28, -27$.)


7. This problem carries out the algorithm suggested by Theorem 1.8 to find representatives
of all proper equivalence classes of binary quadratic forms $(a, b, c)$ of
discriminant $316 = 4 \cdot 79$. For each of these, $b$ will be even.

(a) For each even positive $b$ with $b < \sqrt{4 \cdot 79}$, factor $(b^2 - 4 \cdot 79)/4$ as a product
$ac$ in all possible ways such that $a > 0$ and such that both $|a|$ and $|c|$ lie
between $\sqrt{79} - b/2$ and $\sqrt{79} + b/2$, obtaining 16 forms $(a, b, c)$. Expand
the list by adjoining each form $(-a, b, -c)$, so that the expanded list has 32
members.

(b) Arrange the 32 members of the expanded list of (a) into 6 cycles, obtaining
2 cycles of length 4 and 4 cycles of length 6.

(c) Conclude that $h(4 \cdot 79) = 6$.


8. For discriminant $D = -47$, the class number is $h(-47) = 5$, and the reduced
binary quadratic forms are $(1, 1, 12)$, $(2, 1, 6)$, $(2, -1, 6)$, $(3, 1, 4)$, $(3, -1, 4)$.
Show what the multiplication table is for the proper equivalence classes of these
forms.
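The lists of forms that appear in the problems above can be generated by machine as a check on hand computation. The following Python sketch is an illustration only and is not a substitute for the arguments requested. It assumes that for $D < 0$ the class number counts primitive positive definite forms $(a, b, c)$ that are reduced in the sense $-a < b \leq a \leq c$ with $b \geq 0$ when $a = c$, and for $D = 316$ it simply applies the bounds stated in Problem 7a. The function names are hypothetical.

```python
from math import gcd, isqrt, sqrt

def reduced_forms(D):
    """Reduced primitive positive definite forms (a, b, c) with b^2 - 4ac = D < 0."""
    assert D < 0 and D % 4 in (0, 1)
    forms = []
    # Reduction forces |b| <= a <= c, hence 3a^2 <= 4ac - b^2 = |D|.
    for a in range(1, isqrt(-D // 3) + 1):
        for b in range(-a + 1, a + 1):              # -a < b <= a
            if (b * b - D) % (4 * a):
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (c == a and b < 0):
                continue
            if gcd(gcd(a, abs(b)), c) == 1:
                forms.append((a, b, c))
    return forms

def candidate_forms(D=316):
    """Forms (a, b, c) of discriminant D > 0 with b even, 0 < b < sqrt(D), a > 0,
    and |a|, |c| strictly between sqrt(D)/2 - b/2 and sqrt(D)/2 + b/2."""
    root = sqrt(D) / 2                              # sqrt(79) when D = 316
    forms = []
    for b in range(2, int(sqrt(D)) + 1, 2):
        target = (b * b - D) // 4                   # the product ac, negative here
        low, high = root - b / 2, root + b / 2
        for a in range(1, -target + 1):
            if target % a == 0:
                c = target // a
                if low < a < high and low < -c < high:
                    forms.append((a, b, c))
    return forms

print(len(reduced_forms(-67)), len(reduced_forms(-23)), reduced_forms(-47))
print(len(candidate_forms(316)))
```

Under the stated convention the output for $D = -47$ is the list of five forms in Problem 8, and the count for discriminant 316 is the 16 forms of Problem 7a before the list is expanded.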


Problems 9-11 concern the Jacobi symbol, which is a generalization of the Legendre
symbol. Let $m$ and $n$ be integers with $n > 0$ odd, and let $n = \prod_{j=1}^{r} p_j^{k_j}$ be the prime
factorization of $n$. The Jacobi symbol $\bigl(\frac{m}{n}\bigr)$ is defined to be 0 if $\operatorname{GCD}(m, n) > 1$ and
is defined to be $\prod_{j=1}^{r} \bigl(\frac{m}{p_j}\bigr)^{k_j}$ if $\operatorname{GCD}(m, n) = 1$, where $\bigl(\frac{m}{p_j}\bigr)$ is a Legendre symbol. The
Jacobi symbol therefore extends the domain of the Legendre symbol, and it depends
only on the residue $m \bmod n$. Even when $\operatorname{GCD}(m, n) = 1$, the Jacobi symbol does
not encode whether $m$ is a square modulo $n$, however, since $\bigl(\frac{-1}{21}\bigr) = +1$ and since the
residue $-1$ is not a square modulo 21.
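The definition translates directly into a short program, which is convenient for testing conjectured identities on small cases. The following Python sketch is an illustration only; it evaluates each Legendre symbol by Euler's criterion $\bigl(\frac{m}{p}\bigr) \equiv m^{(p-1)/2} \bmod p$, which is a choice made here rather than something taken from the text, and the function names are hypothetical.

```python
from math import gcd

def legendre(m, p):
    """Legendre symbol (m/p) for an odd prime p, via Euler's criterion."""
    r = pow(m % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r          # r is 0, 1, or p - 1

def prime_factorization(n):
    """Dictionary {prime: exponent} for n >= 1, by trial division."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def jacobi(m, n):
    """Jacobi symbol (m/n) for odd n > 0, straight from the definition."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be odd and positive")
    if gcd(m, n) > 1:
        return 0
    result = 1
    for p, k in prime_factorization(n).items():
        result *= legendre(m, p) ** k
    return result

# The symbol (-1/21) is +1 even though -1 is not a square modulo 21.
print(jacobi(-1, 21), any((x * x + 1) % 21 == 0 for x in range(21)))
```

Looping `jacobi` over small odd $m$ and $n$ gives a quick sanity check of the identities in the problems that follow, though it proves nothing.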


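9. Suppose that $n$ and $n'$ are odd positive integers and that $m$ and $m'$ are integers.
Verify that
(a) $\bigl(\frac{mm'}{n}\bigr) = \bigl(\frac{m}{n}\bigr)\bigl(\frac{m'}{n}\bigr)$ and $\bigl(\frac{m}{nn'}\bigr) = \bigl(\frac{m}{n}\bigr)\bigl(\frac{m}{n'}\bigr)$,
(b) $\bigl(\frac{m^2}{n}\bigr) = \bigl(\frac{m}{n^2}\bigr) = 1$ if $\operatorname{GCD}(m, n) = 1$.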
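10. Prove for all odd positive integers $n$ that
(a) $\bigl(\frac{-1}{n}\bigr) = (-1)^{\frac12 (n-1)}$,
(b) $\bigl(\frac{2}{n}\bigr) = (-1)^{\frac18 (n^2-1)}$.
11. (Quadratic reciprocity) Prove for all odd positive integers $m$ and $n$ satisfying
$\operatorname{GCD}(m, n) = 1$ that $\bigl(\frac{m}{n}\bigr) = (-1)^{[\frac12 (m-1)][\frac12 (n-1)]} \bigl(\frac{n}{m}\bigr)$.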


Problems 12-13 indicate, without spelling out what the group $G$ is, two uses of
Dirichlet's Theorem in the subject of “elliptic curves.” No knowledge of the subject
of elliptic curves is assumed, however.


12. Suppose that $G$ is a finite abelian group whose order $|G|$ divides $p + 1$ for all
sufficiently large primes $p$ with $p \equiv 3 \bmod 4$. It is to be shown that $|G|$ divides
4 by means of multiple applications of Dirichlet's Theorem.

(a) Deduce that 8 does not divide $|G|$ by considering the arithmetic progression
$8k + 3$.

(b) Deduce that 3 does not divide $|G|$ by considering the arithmetic progression
$12k + 7$.

(c) Deduce that no odd prime $q > 3$ divides $|G|$ by considering the arithmetic
progression $4qk + 3$.


13. Suppose that $G$ is a finite abelian group whose order $|G|$ divides $p + 1$ for all
sufficiently large primes $p$ with $p \equiv 2 \bmod 3$. It is to be shown that $|G|$ divides
6 by means of multiple applications of Dirichlet's Theorem.

(a) Deduce that 4 does not divide $|G|$ by considering the arithmetic progression
$12k + 5$.

(b) Deduce that 9 does not divide $|G|$ by considering the arithmetic progression
$9k + 2$.

(c) Deduce that no odd prime $q > 3$ divides $|G|$ by considering the arithmetic
progression $3qk + 2$.


Problems 14-19 develop some elementary properties of ideals and their norms in
quadratic number fields. Notation is as in Sections 6-7. In particular, the number
field is $K = \mathbb{Q}(\sqrt{m})$, the ring $R$ of algebraic integers in it has $\mathbb{Z}$ basis $\{1, \delta\}$, and $\sigma$
is the nontrivial automorphism of $K$ fixing $\mathbb{Q}$.


14. Prove that if $I = (a, r)$ is a nonzero ideal in $R$ with $a \in \mathbb{Z}$ and $r \in R$, then $a$
divides $N(s)$ for every $s$ in $I$.


15. Prove that any nonzero ideal $I$ in $R$ can be written as $I = (a, b + g\delta)$ with $a$,
$b$, and $g$ in $\mathbb{Z}$ and with $a > 0$, $0 \leq b < a$, and $0 < g \leq a$. Prove also that the
$\mathbb{Z}$ basis with these properties is unique, and it has the properties that $g$ divides $a$
and $b$ and that $ag$ divides $N(b + g\delta)$.


16. Let $a$, $b$, and $g$ be integers satisfying $a > 0$, $0 \leq b < a$, and $0 < g \leq a$
with $g$ dividing $a$ and $b$ and with $ag$ dividing $N(b + g\delta)$. Prove that the ideal
$I = (a, b + g\delta)$ in $R$ has $\{a, b + g\delta\}$ as a $\mathbb{Z}$ basis.


17. Prove that if $I = (a, r)$ is a nonzero ideal in $R$ with $a \in \mathbb{Z}$, $r \in R$, and $r = c + d\delta$
for integers $c$ and $d$, then $N(I) = |ad|$.


18. (a) Prove that if $I$ is a nonzero ideal in $R$, then $N(I)$ is the number of elements
in $R/I$.

(b) Deduce that if $I \subseteq J$ are nonzero ideals in $R$, then $N(J)$ divides $N(I)$, and
$I = J$ if and only if $N(I) = N(J)$.


19. (a) Using the Chinese Remainder Theorem, prove that if $I$ and $J$ are nonzero
ideals in $R$ with $I + J = R$, then $N(IJ) = N(I)N(J)$.

(b) Let $P$ be a nonzero prime ideal in $R$, and let $p > 0$ be the prime number
such that $P \cap \mathbb{Z} = (p)\mathbb{Z}$. Then $R/P$ is a vector space over $\mathbb{Z}/p\mathbb{Z}$, and its
order is of the form $p^f$ for some integer $f > 0$. Show by induction on the
integer $e > 0$ that $R/P^e$ has order $p^{ef}$.

(c) Using unique factorization of ideals, deduce that if $I$ and $J$ are any two
nonzero ideals in $R$, then $N(IJ) = N(I)N(J)$.

(d) Prove that any nonzero ideal $I$ of $R$ has $I\sigma(I) = (N(I))$.


Problems 20-24 concern the splitting of prime ideals when extended to quadratic
number fields. Fix a quadratic number field $\mathbb{Q}(\sqrt{m})$, and let $R$, $D$, $\delta$, and $\sigma$ be as
in Sections 6-7. Let $p > 0$ be a prime in $\mathbb{Z}$. According to Theorem 9.62 of Basic
Algebra, the unique factorization of the ideal $(p)R$ in $R$ is one of the following:
$(p)R = (p)$ is already prime in $R$, $(p)R = P_1 P_2$ is the product of two distinct prime
ideals, or $(p)R = P^2$ is the square of a prime ideal.

20. Deduce from the formula $N((p)R) = p^2$ that if $P$ is a nontrivial factor in the
unique factorization of the ideal $(p)R$, then $N(P) = p$.


21. This problem concerns the prime $p = 2$.

(a) Use Problem 15 to prove that if $D \equiv 5 \bmod 8$, then $(2)R$ is a prime ideal
in $R$.

(b) Prove that if $D \equiv 1 \bmod 8$, then $(2)R$ factors into the product of two distinct
prime factors as $(2)R = (2, \delta)(2, 1 + \delta)$.

(c) Prove that if $D$ is even and $D/4 \equiv 3 \bmod 4$, then $(2)R = (2, 1 + \delta)^2$ exhibits
$(2)R$ as the square of a prime ideal.

(d) Prove that if $D$ is even and $D/4 \equiv 2 \bmod 4$, then $(2)R = (2, \delta)^2$ exhibits
$(2)R$ as the square of a prime ideal.


22. Let $p$ be an odd prime.

(a) Prove that if $D$ is odd, then $(p)R$ has a nontrivial factorization into prime
ideals if and only if $x^2 + x + \tfrac14 (1 - D) \equiv 0 \bmod p$ has a solution, and in
this case a factorization of $(p)R$ is as $(p)R = (p, x + \delta)(p, x + \sigma(\delta))$.

(b) Prove that if $D$ is even, then $(p)R$ has a nontrivial factorization into prime
ideals if and only if $x^2 - (D/4) \equiv 0 \bmod p$ has a solution, and in this case a
factorization of $(p)R$ is as $(p)R = (p, x + \delta)(p, x + \sigma(\delta))$.

(c) Deduce from (a) and (b) that $(p)R$ has a nontrivial factorization into prime
ideals if and only if $D$ is a square modulo $p$.


23. Let $p$ be an odd prime such that $D$ is a square modulo $p$, so that Problem 22c
gives a nontrivial factorization of $(p)R$ into prime ideals of the form $(p)R =
(p, x + \delta)(p, x + \sigma(\delta))$ for some integer $x$. Let $I = (p, x + \delta)$.

(a) Prove that if $D$ is odd, then $\sigma(I) = I$ if and only if the integer $x$ is $\tfrac12 (p - 1)$.
(b) Prove that if $D$ is even, then $\sigma(I) = I$ if and only if the integer $x$ is 0.


24. Let $p$ be an odd prime such that $D$ is a square modulo $p$, so that Problem 22c
gives a nontrivial factorization of $(p)R$ into prime ideals of the form $(p)R =
(p, x + \delta)(p, x + \sigma(\delta))$ for some integer $x$. Using the previous problem, show
that the two factors on the right are the same ideal if and only if $p$ divides $D$.


Problems 25-29 seek to identify the genus group explicitly for fundamental discriminants
$D$. Let $K = \mathbb{Q}(\sqrt{m})$ be the corresponding quadratic number field, let $R$ be
the ring of algebraic integers in $K$, and let $\sigma$ be the nontrivial automorphism of $K$
fixing $\mathbb{Q}$. Let $E = \{p_1, \ldots, p_{g+1}\}$ with $g \geq 0$ be the set of distinct prime divisors
of $D$. The goal of this set of problems is to prove that the order of the genus group
is $2^g$ and to exhibit ideals in $R$ representing each genus. Recall from Theorem 1.20
that strict equivalence classes of ideals correspond to proper equivalence classes of
binary quadratic forms and therefore that each genus corresponds to a set of proper
equivalence classes of binary quadratic forms.


25. Let the form class group $H$ for discriminant $D$ be isomorphic to a product of cyclic
groups of orders $2^{k_1}, \ldots, 2^{k_r}, q_1^{l_1}, \ldots, q_s^{l_s}$, where $k_1, \ldots, k_r$ and $l_1, \ldots, l_s$ are
positive integers and $q_1, \ldots, q_s$ are odd primes that are not necessarily distinct.
Prove that the genus group has order $2^r$ and is abstractly isomorphic to the
subgroup of $H$ of elements whose order divides 2. (Educational note: Thus a
goal of the present set of problems is to show that $r = g$.)


26. According to Problems 20-24, the nonzero prime ideals of $R$ are of three kinds:
(i) unique distinct ideals $I = (p, b + \delta)$ and $\sigma(I) = (p, b + \sigma(\delta))$ with product
$(p)R$ if $p$ is an odd prime not dividing $D$ such that $x^2 \equiv D \bmod p$
is solvable, or if $p = 2$ and $D \equiv 1 \bmod 8$,
(ii) the ideal $(p)R$ if $p$ is an odd prime not dividing $D$ such that $x^2 \equiv
D \bmod p$ is not solvable, or if $p = 2$ and $D \equiv 5 \bmod 8$,
(iii) a unique ideal $I_p$ with $I_p^2 = (p)R$ if $p$ divides $D$.

For each subset $S \subseteq E$ of the $g + 1$ distinct prime divisors of $D$, define $I_S =
\prod_{p \in S} I_p$.

(a) Using unique factorization of ideals in $R$, show that any nonzero proper ideal
$I$ in $R$ with $\sigma(I) = I$ is of the form $(a) I_S$ for some $a \in \mathbb{Z}$ and some subset
$S \subseteq E$.

(b) By considering norms of ideals, show that $I$ uniquely determines $S$ in (a).

27. (a) The element $x = -1$ of $K$ has $N(x) = 1$ and factors as $x = \sigma(y) y^{-1}$ for
the element $y = \sqrt{m}$ of $K$. For all other elements $x$ of $K$ with norm 1,
verify the formula

$$\frac{1 + \sigma(x)}{1 + x} = \frac{(1 + \sigma(x))x}{(1 + x)x} = \frac{x + x\sigma(x)}{(1 + x)x} = \frac{1 + x}{(1 + x)x} = x^{-1},$$

and explain why it shows that $x$ is of the form $\sigma(y) y^{-1}$ for some $y \neq 0$ in $K$.
(Educational note: This result is a special case of Hilbert’s Theorem 90, 
which is a theorem in the cohomology of groups and appears in Chapter IT. 
The general theorem says for a finite Galois extension K/k with Galois 
group $\Gamma$ that the cohomology $H^1$ of the group $\Gamma$ with coefficients in the
abelian group $K^\times$ is 0.)

(b) Show that the element $y$ in (a) can be taken to be in $R$ and that all such $y$'s
in $R$ are $\mathbb{Z}$ multiples of one of them $y_0$, which is unique up to a factor of $-1$.


28. Let $I$ be a nonzero ideal in $R$ whose class in the ideal class group $\mathcal{H}$ has order 2,
i.e., an ideal $I$ such that $I^2 = (x)$ for some element $x \in R$.
(a) Show that the element $x N(I)^{-1}$ of $K$ has norm 1.
(b) Show that the corresponding element $y_0$ of $R$ from the previous problem has
the property that $\sigma((y_0) I) = (y_0) I$.

(c) Using either $y_0$ or $y_0 \sqrt{m}$ from (b), deduce that for any nonzero ideal $I$ in
$R$ with $I^2$ principal, there is a strictly equivalent ideal $I_S$ for some subset
$S \subseteq E$ of the $g + 1$ prime divisors of $D$. Consequently the order of the genus
group is a power of 2 equal to at most $2^{g+1}$.


29. This problem shows that the number of ideals $I_S$ in the previous problem that
are mutually strictly inequivalent is exactly $2^g$. To get at this fact, the problem
investigates properties of principal ideals $I = (x)$ in $R$ with the properties that
$\sigma(I) = I$ and $N(x) > 0$. Since $\sigma(I) = I$, it must be true that $\sigma(x) = \varepsilon x$ for
some unit $\varepsilon$ in $R$, and then $N(\sigma(x)) = N(x)$ implies that $N(\varepsilon) = +1$. Matters
now split into cases along the lines of the hypotheses of Proposition 1.17.

(a) Under the assumption that m < 0 and that m is neither −1 nor −3, show that
if a principal ideal I = (x) in R has σ(I) = I, then x is in Z or in Z√m.

(b) Under the assumption that m < 0, show that the only subsets S of E for
which the ideal I_S is principal are S = ∅ and S equal to the set of all
prime divisors of m, i.e., S equal to E for D odd and for D even with
D/4 ≡ 2 mod 4 and S equal to E − {2} for D even with D/4 ≡ 3 mod 4.

(c) Under the assumption that m < 0, Proposition 1.17 says that strict equivalence
for ideals coincides with equivalence. Show how to conclude from
this fact and the results of (a) and (b) that the order of the genus group is 2^g
when m < 0.

(d) Under the assumption that m > 0 and that the fundamental unit ε₁ has norm
−1, Proposition 1.17 says that strict equivalence for ideals coincides with
equivalence. With I, x, and ε as in the statement of the problem, show that
ε = ε₁^{2n} for some integer n ≥ 0. Deduce that σ(ε₁^n x) = sε₁^n x for a suitable
choice of sign s, and show as a consequence that I_S is principal for the same
S’s as in (b) and that the order of the genus group is 2^g.

(e) Under the assumption that m > 0 and that the fundamental unit ε₁ has
norm +1, Proposition 1.17 says that strict equivalence for ideals is distinct
from equivalence; in particular, there are two strict equivalence classes of
principal ideals: those with a generator of positive norm and those with a
generator of negative norm. Let y₀⁺ and y₀⁻ be the elements produced by
Problem 27 that satisfy ε₁ = σ(y₀⁺)(y₀⁺)⁻¹ and −ε₁ = σ(y₀⁻)(y₀⁻)⁻¹. Prove
that exactly one of y₀⁺ and y₀⁻ has positive norm, so that two of the principal
ideals (1), (y₀⁺), (y₀⁻), (√m) are strictly equivalent to (1), and two are not.
Prove that all four of these principal ideals are of the form I_S and that they
are distinct. By expressing elements arising from Problem 27 for the most
general unit in R in terms of y₀^± and ε₁, show that no other I_S is a principal
ideal. Show as a consequence that the number of strict equivalence classes
of ideals among the I_S’s is 2^g.

Problems 30–34 show that proper equivalence over Q for two integer forms of
fundamental discriminant D implies proper equivalence over Z/DZ. Consequently
the order of the genus group is at least the number of classes of integer forms of
discriminant D under proper equivalence over Z/DZ. It will follow from the next
set of problems, concerning “genus characters,” that the number of such classes is at
least 2^g, where g + 1 is the number of distinct prime divisors of D. In combination
with Problem 29, this result shows that the number of genera equals 2^g. Throughout
this set of problems, let D be a fundamental discriminant.


30. Let (a₁, b₁, c₁) be a binary quadratic form over Z of discriminant D. Using
Lemma 1.10, prove that (a₁, b₁, c₁) is properly equivalent over Z to a form
(a, b, c) of discriminant D such that GCD(a, D) = 1.


31. Suppose that (a, b, c) is a binary quadratic form over Z of discriminant D such
that GCD(a, D) = 1.

(a) Prove that if D is odd, then (a, b, c) is properly equivalent over Z to a form
(a, kD, lD) for some integers k and l.

(b) Prove that if D is even, then (a, b, c) is properly equivalent over Z to a form
(a, 2kD, −a(D/4) + lD) for some integers k and l.


32. Suppose that (a, kD, lD) is a form over Z having odd discriminant D, satisfying
GCD(a, D) = 1, and taking on an integer value r relatively prime to D for some
rational (x, y). Write x and y as fractions with a positive common denominator
as small as possible: x = u/w and y = v/w.

(a) Prove that GCD(w, D) = 1, and conclude that a ≡ d²r mod D for some
integer d relatively prime to D.

(b) Suppose that (a′, k′D, l′D) is a second form over Z having discriminant D,
satisfying GCD(a′, D) = 1, and taking on the value r at some rational point.
Prove that a′ ≡ as² mod D for some s relatively prime to D.

(c) Suppose that (a, b, c) and (a′, b′, c′) are forms over Z of the same odd
discriminant with GCD(a, D) = GCD(a′, D) = 1, and suppose that these
forms are properly equivalent over Q. Deduce that (a, b, c) and (a′, b′, c′)
are properly equivalent over Z/DZ in the sense that there exists a matrix
$\begin{pmatrix}\alpha & \beta\\ \gamma & \delta\end{pmatrix}$ in SL(2, Z/DZ)
such that substitution of x = αx′ + βy′ and y = γx′ + δy′ leads from
ax² + bxy + cy² modulo D to a′x′² + b′x′y′ + c′y′² modulo D.


33. Suppose that (a, 2kD, −a(D/4) + lD) is a form over Z having even discriminant
D, satisfying GCD(a, D) = 1, and taking on an integer value r relatively prime
to D for some rational (x, y). Write x and y as fractions with a positive common
denominator as small as possible: x = u/w and y = v/w.

(a) Prove that GCD(w, D) = 1, and obtain a congruence relating a and r
modulo D.

(b) Suppose that (a′, 2k′D, −a′(D/4) + l′D) is a second form over Z having
discriminant D, satisfying GCD(a′, D) = 1, and taking on the value
r at some rational point. Prove that $\left(\frac{a}{p}\right) = \left(\frac{a'}{p}\right)$
for every odd prime p dividing D.

(c) In the setting of (b), suppose in addition that D/4 ≡ 3 mod 4. Prove that
a ≡ a′ mod 4.

(d) In the setting of (b), suppose in addition that D/4 ≡ 2 mod 4. Prove for
D/4 ≡ 2 mod 8 that a′ ≡ ±a mod 8, and prove for D/4 ≡ 6 mod 8 that
either a′ ≡ a mod 8 or a′ ≡ 3a mod 8.

(e) Suppose that (a, b, c) and (a′, b′, c′) are forms over Z of the same even
discriminant with GCD(a, D) = GCD(a′, D) = 1, and suppose that these
forms are properly equivalent over Q. Deduce that (a, b, c) and (a′, b′, c′)
are properly equivalent over Z/DZ.


34. Why does it follow from Problems 30-33 that the order of the genus group for 
discriminant D is at least as large as the number of proper equivalence classes 
under SL(2, Z/DZ) of integer forms of discriminant D? 


Problems 35–40 introduce “genus characters.” In fact, genus characters are already
implicit in Problems 32 and 33. Throughout this set of problems, let D be a
fundamental discriminant, and suppose that D has exactly g + 1 distinct prime factors.
The content of these problems will be summarized in Problem 40. Call two binary
quadratic forms over Z of discriminant D similar modulo D if they take on the same
residues r modulo D that are relatively prime to D. Proper equivalence over Z via
SL(2, Z) implies proper equivalence modulo D via SL(2, Z/DZ), and this in turn
implies similarity modulo D in the sense that was just defined. Problems 30–31 show
that it is enough to study forms ax² mod D for D odd, where GCD(a, D) = 1, and
to study forms a(x² − (D/4)y²) for D even, again where GCD(a, D) = 1. Initially
the genus characters are functions of pairs (similarity class, r), where r is a residue
modulo D with GCD(r, D) = 1 such that r is represented by the form modulo D.
The values of these functions are $\left(\frac{r}{p}\right)$ for each odd prime p > 0 dividing D, as well
as the indicated one of the following for p = 2 if D is even:

\[
\begin{aligned}
\xi(r) &= \left(\tfrac{-1}{r}\right) = (-1)^{\frac12(r-1)} && \text{if } D \text{ is even and } D/4 \equiv 3 \bmod 4,\\
\eta(r) &= \left(\tfrac{2}{r}\right) = (-1)^{\frac18(r^2-1)} && \text{if } D \text{ is even and } D/4 \equiv 2 \bmod 8,\\
\xi(r)\eta(r) &= \left(\tfrac{-2}{r}\right) = (-1)^{\frac12(r-1)+\frac18(r^2-1)} && \text{if } D \text{ is even and } D/4 \equiv 6 \bmod 8.
\end{aligned}
\]

Thus g + 1 expressions have been defined for each ordered pair (similarity class, r).


35. Using Problems 32 and 33, show that the genus characters are independent of the
residue r modulo D with GCD(r, D) = 1 such that r is represented by the form
modulo D. Therefore the residue a in the quadratic form, either ax² mod D for
D odd or a(x² − (D/4)y²) for D even, can be used as r, and the genus characters
are g + 1 functions defined on the set of similarity classes modulo D.


36. Prove that the genus characters respect the operation of multiplication of proper
equivalence classes of forms over Z.


37. The product of all g + 1 genus characters is 1 in every case. A sketch of the
argument for D odd is as follows: Since D ≡ 1 mod 4, D has an even number
2t of prime factors 4k + 3. Use of the Jacobi symbol with a odd and p varying
over the (odd) prime divisors of D gives

\[
\prod_{p}\left(\frac{a}{p}\right)
= \prod_{p=4k+1}\left(\frac{a}{p}\right)\prod_{p=4k+3}\left(\frac{a}{p}\right)
= \xi(a)^{2t}\prod_{p=4k+1}\left(\frac{p}{a}\right)\prod_{p=4k+3}\left(\frac{p}{a}\right)
= \left(\frac{D}{a}\right),
\]

and the right side is +1 by Problem 2a. Using this sketch as a guide, show that
the product of all g + 1 genus characters is 1 for the cases that D is even and
(a) D/4 ≡ 3 mod 4,
(b) D/4 ≡ 2 mod 8,
(c) D/4 ≡ 6 mod 8.


38. If D is even, let α be ξ if D/4 ≡ 3 mod 4, η if D/4 ≡ 2 mod 8, and ξη
if D/4 ≡ 6 mod 8. Let p ↦ s_p be any function to {±1} from the set of
distinct prime divisors of D. Using Dirichlet’s Theorem on primes in arithmetic
progressions, prove that there exists a prime q such that $\left(\frac{q}{p}\right) = s_p$ for each odd
prime divisor p of D and α(q) = s_2 in case D is even.


39. With α as in the previous problem, let p ↦ s_p be any function to {±1} from the
set of distinct prime divisors of D such that ∏_p s_p = +1, and choose a prime
q as in the previous problem. Prove that q is primitively representable by some
integer binary quadratic form of discriminant D and that the values of the genus
characters on this form are the numbers s_p. Conclude that the number of distinct
similarity classes modulo D is at least 2^g.


40. For the quadratic number field K = Q(√m) with discriminant D, suppose that
D has g + 1 distinct prime divisors. Conclude that the following equivalence
classes of binary quadratic forms over Z of discriminant D coincide and that the
number of such classes is 2^g:
(i) classes relative to proper equivalence over Q, i.e., genera,
(ii) classes relative to proper equivalence over Z/DZ,
(iii) classes relative to similarity modulo D.
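
The count 2^g in Problem 40 can be spot-checked by machine for a few negative fundamental discriminants. The Python sketch below is an illustration, not part of the problems, and it rests on assumptions that are standard but are not established in this text: every proper equivalence class of forms of discriminant D < 0 contains exactly one reduced form (a, b, c) with −a < b ≤ a ≤ c and with b ≥ 0 when a = c; the classes of order at most 2 are those whose reduced form has b = 0, b = a, or a = c; and the genus group is taken to be the class group modulo squares, so that its order equals the number of classes of order at most 2. The function names are hypothetical.

```python
from math import gcd

def reduced_forms(D):
    """Primitive reduced forms (a, b, c) with b*b - 4*a*c == D, for D < 0."""
    assert D < 0 and D % 4 in (0, 1)
    forms = []
    a = 1
    while 3 * a * a <= -D:                 # reduced forms have 3a^2 <= |D|
        for b in range(-a + 1, a + 1):     # -a < b <= a
            if (b * b - D) % (4 * a) != 0:
                continue
            c = (b * b - D) // (4 * a)
            if c < a or (a == c and b < 0):
                continue
            if gcd(a, gcd(b, c)) == 1:
                forms.append((a, b, c))
        a += 1
    return forms

def classes_of_order_at_most_two(D):
    """Count reduced forms with b = 0, b = a, or a = c."""
    return sum(1 for (a, b, c) in reduced_forms(D) if b == 0 or b == a or a == c)

def num_prime_divisors(n):
    """Number of distinct prime divisors of |n|, by trial division."""
    n, count, p = abs(n), 0, 2
    while p * p <= n:
        if n % p == 0:
            count += 1
            while n % p == 0:
                n //= p
        p += 1
    return count + (1 if n > 1 else 0)

for D in (-3, -4, -15, -20, -23, -56, -84):
    g = num_prime_divisors(D) - 1
    print(D, len(reduced_forms(D)), classes_of_order_at_most_two(D), 2 ** g)
```

For each discriminant the last two printed numbers should agree, which is the numerical shadow of the statement that the number of genera is 2^g.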


CHAPTER II 


Wedderburn–Artin Ring Theory


Abstract. This chapter studies finite-dimensional associative division algebras, as well as other 
finite-dimensional associative algebras and closely related rings. The chapter is in two parts that 
overlap slightly in Section 6. The first part gives the structure theory of the rings in question, and 
the second part aims at understanding limitations imposed by the structure of a division ring. 

Section 1 briefly summarizes the structure theory for finite-dimensional (nonassociative) Lie
algebras that was the primary historical motivation for structure theory in the associative case. All 
the algebras in this chapter except those explicitly called Lie algebras are understood to be associative. 

Section 2 introduces left semisimple rings, defined as rings R with identity such that the left 
R module R is semisimple. Wedderburn’s Theorem says that such a ring is the finite product of 
full matrix rings over division rings. The number of factors, the size of each matrix ring, and the 
isomorphism class of each division ring are uniquely determined. It follows that left semisimple 
and right semisimple are the same. If the ring is a finite-dimensional algebra over a field F, then the
various division rings are finite-dimensional division algebras over F. The factors of semisimple
rings are simple, i.e., are nonzero and have no nontrivial two-sided ideals, but an example is given 
to show that a simple ring need not be semisimple. Every finite-dimensional simple algebra is 
semisimple. 

Section 3 introduces chain conditions into the discussion as a useful generalization of finite 
dimensionality. A ring R with identity is left Artinian if the left ideals of the ring satisfy the 
descending chain condition. Artin’s Theorem for simple rings is that left Artinian is equivalent to 
semisimplicity, hence to the condition that the given ring be a full matrix ring over a division ring. 

Sections 4–6 concern what happens when the assumption of semisimplicity is dropped but some
finiteness condition is maintained. Section 4 introduces the Wedderburn–Artin radical rad R of a
left Artinian ring R as the sum of all nilpotent left ideals. The radical is a two-sided nilpotent ideal.
It is 0 if and only if the ring is semisimple. More generally R/rad R is always semisimple if R is
left Artinian. Sections 5–6 state and prove Wedderburn’s Main Theorem, that a finite-dimensional
algebra R with identity over a field F of characteristic 0 has a semisimple subalgebra S such that R
is isomorphic as a vector space to S ⊕ rad R. The semisimple algebra S is isomorphic to R/rad R.
Section 5 gives the hard part of the proof, which handles the special case that R/rad R is isomorphic
to a product of full matrix algebras over F. The remainder of the proof, which appears in Section 6,
follows relatively quickly from the special case in Section 5 and an investigation of circumstances 
under which the tensor product over F of two semisimple algebras is semisimple. Such a tensor 
product is not always semisimple, but it is semisimple in characteristic 0. 

The results about tensor products in Section 6, but with other hypotheses in place of the condition 
of characteristic 0, play a role in the remainder of the chapter, which is aimed at identifying certain 
division rings. Sections 7–8 provide general tools. Section 7 begins with further results about tensor
products. Then the Skolem–Noether Theorem gives a relationship between any two homomorphisms
of a simple subalgebra into a simple algebra whose center coincides with the underlying field of
scalars. Section 8 proves the Double Centralizer Theorem, which says for this situation that the
centralizer of the simple subalgebra in the whole algebra is simple and that the product of the 
dimensions of the subalgebra and the centralizer is the dimension of the whole algebra. 

Sections 9-10 apply the results of Sections 6-8 to obtain two celebrated theorems — Wedderburn’s 
Theorem about finite division rings and Frobenius’s Theorem classifying the finite-dimensional 
associative division algebras over the reals. 


1. Historical Motivation 


Elementary ring theory came from several sources historically and was already in 
place by 1880. Some of the sources are field theory (studied by Galois and others), 
rings of algebraic integers (studied by Gauss, Dirichlet, Kummer, Kronecker, 
Dedekind, and others), and matrices (studied by Cayley, Hamilton, and others). 
More advanced general ring theory arose initially not on its own but as an effort 
to imitate the theory of “Lie algebras,” which began about 1880. 

A brief summary of some early theorems about Lie algebras will put matters 
in perspective. The term “algebra” in connection with a field F refers at least to 
an F vector space with a multiplication that is F bilinear. This chapter will deal 
only with two kinds of such algebras, the Lie algebras and those algebras whose 
multiplication is associative. If the modifier “Lie” is absent, the understanding is 
that the algebra is associative. 

Lie algebras arose originally from “Lie groups”—which we can regard for 
current purposes as connected groups with finitely many smooth parameters — 
by a process of taking derivatives along curves at the identity element of the 
group. Precise knowledge of that process will be unnecessary in our treatment, 
but we describe one example: The vector space M_n(R) of all n-by-n matrices over
R becomes a Lie algebra with multiplication defined by the “bracket product”
[X, Y] = XY − YX. If G is a closed subgroup of the matrix group GL(n, R)
and g is the set of all members of M_n(R) of the form X = c′(0), where c is a
smooth curve in G with c(0) equal to the identity, then it turns out that the vector
space g is closed under the bracket product and is a Lie algebra. Although one 
might expect the Lie algebra g to give information about the Lie group G only 
infinitesimally at the identity, it turns out that g determines the multiplication rule 
for G in a whole open neighborhood of the identity. Thus the Lie group and Lie 
algebra are much more closely related than one might at first expect. 


We turn to the underlying definitions and early main theorems about Lie alge- 
bras. Let F be a field. A vector space A over F with an F bilinear multiplication 
(X, Y) ↦ [X, Y] is a Lie algebra if the multiplication has the two properties

(i) [X, X] = 0 for all X ∈ A,
(ii) (Jacobi identity) [X, [Y, Z]] + [Y, [Z, X]] + [Z, [X, Y]] = 0 for all
X, Y, Z ∈ A.

Multiplication is often referred to as bracket. It is usually not associative. The
vector space M_n(F) with [X, Y] = XY − YX is a Lie algebra, as one easily
checks by expanding out the various brackets that are involved; it is denoted by
gl(n, F).
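
The two defining properties can also be checked numerically for the bracket XY − YX. The following Python sketch is only an illustration: it uses integer matrices as sample elements of gl(3, Q), and the helper names are hypothetical. It is no substitute for expanding the brackets by hand.

```python
import random

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def sub(X, Y):
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def bracket(X, Y):
    """The bracket product [X, Y] = XY - YX."""
    return sub(matmul(X, Y), matmul(Y, X))

def jacobi_sum(X, Y, Z):
    """[X,[Y,Z]] + [Y,[Z,X]] + [Z,[X,Y]], which should be the zero matrix."""
    return add(add(bracket(X, bracket(Y, Z)), bracket(Y, bracket(Z, X))),
               bracket(Z, bracket(X, Y)))

def random_matrix(n, rng):
    return [[rng.randint(-5, 5) for _ in range(n)] for _ in range(n)]

rng = random.Random(0)
n = 3
zero = [[0] * n for _ in range(n)]
X, Y, Z = (random_matrix(n, rng) for _ in range(3))
print(bracket(X, X) == zero)        # property (i)
print(jacobi_sum(X, Y, Z) == zero)  # property (ii)
```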

The elementary structural definitions with Lie algebras run parallel to those 
with rings. A Lie subalgebra S of A is a vector subspace closed under brackets, 
an ideal I of A is a vector subspace such that [X, Y] is in I for X ∈ I and Y ∈ A,
a homomorphism φ : A₁ → A₂ of Lie algebras is a linear mapping respecting
brackets in the sense that φ[X, Y] = [φ(X), φ(Y)] for all X, Y ∈ A₁, and an
isomorphism is an invertible homomorphism. Every ideal is a Lie subalgebra. 
In contrast to the case of rings, there is no distinction between “left ideals” and 
“right ideals” because the bracket product is skew symmetric. Under the passage 
from Lie groups to Lie algebras, abelian Lie groups yield Lie algebras with all 
brackets 0, and thus one says that a Lie algebra is abelian if all its brackets are 0. 

Examples of Lie subalgebras of gl(n, F) are the subalgebra sl(n, F) of all 
matrices of trace 0, the subalgebra so(n, F) of all skew-symmetric matrices, and
the subalgebra of all upper-triangular matrices. 

The elementary properties of subalgebras, homomorphisms, and so on for Lie 
algebras mimic what is true for rings: The kernel of a homomorphism is an 
ideal. Any ideal is the kernel of a quotient homomorphism. If I is an ideal in
A, then the ideals of A/I correspond to the ideals of A containing I, just as
in the First Isomorphism Theorem for rings. If I and J are ideals in A, then
(I + J)/I ≅ J/(I ∩ J), just as in the Second Isomorphism Theorem for rings.

The connection of Lie algebras to Lie groups makes one want to introduce 
definitions that lead toward classifying all Lie algebras that are finite-dimensional. 
We therefore assume for the remainder of this section that all Lie algebras under 
discussion are finite-dimensional over F. Some of the steps require conditions 
on F, and we shall assume that F has characteristic 0.

Group theory already had a notion of “solvable group” from Galois, and this 
leads to the notion of solvable Lie algebra. In A, let [A, A] denote the linear span 
of all [X, Y] with X, Y ∈ A; [A, A] is called the commutator ideal of A, and
A/[A, A] is abelian. In fact, [A, A] is the smallest ideal I in A such that A/I
is abelian. Starting from A, let us form successive commutator ideals. Thus put
A₀ = A, A₁ = [A₀, A₀], …, A_n = [A_{n−1}, A_{n−1}], so that

A = A₀ ⊇ A₁ ⊇ ⋯ ⊇ A_n ⊇ ⋯ .

The terms of this sequence are all the same from some point on, by finite
dimensionality, and we say that A is solvable if the terms are ultimately 0. One easily
checks that the sum I + J of two solvable ideals in A, i.e., the set of sums, is
a solvable ideal. By finite dimensionality, there exists a unique largest solvable
ideal. This is called the radical of A and is denoted by rad A. The Lie algebra
A is said to be semisimple if rad A = 0. It is easy to use the First Isomorphism
Theorem to check that A/rad A is always semisimple.
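
The sequence of successive commutator ideals can be computed explicitly for matrix Lie algebras. The Python sketch below is an illustration under stated assumptions: a Lie algebra is given by a list of matrices that form a basis, linear independence is tested over Q with exact fractions, and the helper names are hypothetical. It records the dimensions of A₀, A₁, … until the sequence reaches 0 or stops shrinking.

```python
from fractions import Fraction

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def bracket(X, Y):
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[a - b for a, b in zip(r1, r2)] for r1, r2 in zip(XY, YX)]

def independent_subset(mats):
    """Return a maximal linearly independent sublist of mats, working over Q."""
    chosen, reduced = [], []            # reduced holds (pivot index, reduced vector)
    for M in mats:
        v = [Fraction(x) for row in M for x in row]
        for piv, r in reduced:
            if v[piv] != 0:
                factor = v[piv] / r[piv]
                v = [a - factor * b for a, b in zip(v, r)]
        piv = next((i for i, x in enumerate(v) if x != 0), None)
        if piv is not None:
            reduced.append((piv, v))
            chosen.append(M)
    return chosen

def derived_series_dims(basis):
    """Dimensions of A_0, A_1, ... until the series hits 0 or stabilizes."""
    dims = [len(basis)]
    current = basis
    while current:
        nxt = independent_subset([bracket(X, Y) for X in current for Y in current])
        if len(nxt) == len(current):
            break                       # stabilized at a nonzero ideal
        dims.append(len(nxt))
        current = nxt
    return dims

def unit(i, j, n):
    """Matrix unit with a 1 in position (i, j) and 0 elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

upper = [unit(i, j, 3) for i in range(3) for j in range(i, 3)]   # upper-triangular 3-by-3
sl2 = [[[1, 0], [0, -1]], [[0, 1], [0, 0]], [[0, 0], [1, 0]]]    # basis of sl(2, Q)
print(derived_series_dims(upper))
print(derived_series_dims(sl2))
```

For the upper-triangular 3-by-3 matrices the dimensions come out as 6, 3, 1, 0, so that algebra is solvable, while for sl(2, Q) the series never drops below dimension 3.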

In the direction of classifying Lie algebras, one might therefore want to see how 
all solvable Lie algebras can be constructed by successive extensions, identify 
all semisimple Lie algebras, and determine how a general Lie algebra can be 
constructed from a semisimple Lie algebra and a solvable Lie algebra by an 
extension. 

The first step in this direction historically concerned identifying semisimple 
Lie algebras. We say that the Lie algebra A is simple if dim A > 1 and if A 
contains no nonzero proper ideals. 

Working with the field C but in a way that applies to other fields of 
characteristic 0, W. Killing proved in 1888 that A is semisimple if and only 
if A is the (internal) direct sum of simple ideals. In this case the direct summands 
are unique, and the only ideals in A are the partial direct sums. 

This result is strikingly different from what happens for abelian Lie algebras, 
for which the theory reduces to the theory of vector spaces. A 2-dimensional 
vector space is the internal direct sum of two 1-dimensional subspaces in many 
ways. But Killing’s theorem says that the decomposition of semisimple Lie 
algebras into simple ideals is unique, not just unique up to some isomorphism. 

E. Cartan in his 1894 thesis classified the simple Lie algebras, up to isomor- 
phism, for the case that the field is C. The Lie algebras sl(n, C) for n ≥ 2 and
so(n, C) for n = 3 and n ≥ 5 were in his list, and there were others. Killing had
come close to this classification in his 1888 work, but he had made a number of 
errors in both his statements and his proofs. 

E. E. Levi in 1905 addressed the extension problem for obtaining all finite- 
dimensional Lie algebras over C from semisimple ones and solvable ones. His 
theorem is that for any Lie algebra A, there exists a subalgebra S isomorphic to 
A/rad A such that A = S ⊕ rad A as vector spaces. In essence, this result says
that the extension defining A is given by a semidirect product. 

The final theorem in this vein at this time in history was a 1914 result of Cartan 
classifying the simple Lie algebras when the field F is R. This classification is a 
good bit more complicated than the classification when F is C. 


With this background in mind, we can put into context the corresponding 
developments for associative algebras. Although others had done some earlier 
work, J. H. M. Wedderburn made the first big advance for associative algebras in
1905. Wedderburn’s theory in a certain sense is more complicated than the theory 
for Lie algebras because left ideals in the associative case are not necessarily two- 
sided ideals. Let us sketch this theory. 

For the remainder of this section until the last paragraph, A will denote a finite-
dimensional associative algebra over a field F of characteristic 0, possibly the 0
algebra. We shall always assume that A has an identity. Although we shall make
some definitions here, we shall repeat them later in the chapter at the appropriate 
times. For many results later in the chapter, the field F will not be assumed to be 
of characteristic 0. 

As in Chapter X of Basic Algebra, a unital left A module M is said to be simple 
if it is nonzero and it has no proper nonzero A submodules, semisimple if it is the
sum (or equivalently the direct sum) of simple A submodules. The algebra A is 
semisimple if the left A module A is a semisimple module, i.e., if A is the direct 
sum of simple left ideals; A is simple if it is nonzero and has no nontrivial two- 
sided ideals. In contrast to the setting of Lie algebras, we make no exception for 
the 1-dimensional case; this distinction is necessary and is continually responsible 
for subtle differences between the two theories. 

Wedderburn’s first theorem has two parts to it, the first one modeled on Killing’s 
theorem for Lie algebras and the second one modeled on Cartan’s thesis: 


(i) The algebra A is semisimple if and only if it is the (internal) direct sum 
of simple two-sided ideals. In this case the direct summands are unique, 
and the only two-sided ideals of A are the partial direct sums. 

(ii) The algebra A is simple if and only if A ≅ M_n(D) for some integer n ≥ 1
and some division algebra D over F. In particular, if F is algebraically
closed, then A ≅ M_n(F) for some n.


E. Artin generalized the Wedderburn theory to a suitable kind of “semisimple 
ring.” For part of the theory, he introduced a notion of “radical” for the associative 
case—the radical of a finite-dimensional associative algebra A being the sum of 
the “nilpotent” left ideals of A. Here a left ideal I is called nilpotent if I^k = 0
for some k. The radical rad A is a two-sided ideal, and A/rad A is a semisimple
ring.

Wedderburn’s Main Theorem, proved later in time and definitely assuming 
characteristic 0, is an analog for associative algebras of Levi’s result about Lie 
algebras. The result for associative algebras is that A decomposes as a vector- 
space direct sum A = S ⊕ rad A, where S is a semisimple subalgebra isomorphic
to A/rad A. 


The remaining structural question for finite-dimensional associative algebras 
is to say something about simple algebras when the field is not algebraically 
closed. Such a result may be regarded as an analog of the 1914 work by Cartan. 
In the associative case one then wants to know what the F isomorphism classes of 
finite-dimensional associative division algebras D are for a given field F. We now
drop the assumption that the field F has characteristic 0. In asking this question,
one does not want to repeat the theory of field extensions. Consequently one
looks only for classes of division algebras whose center is F. If F is algebraically
closed, the only such D is F itself, as we shall observe in more detail in Section 2.
If F is a finite field, one is led to another theorem of Wedderburn’s, saying that D
has to be commutative and hence that D = F; this theorem appears in Section 9.
If F is R, one is led to a theorem of Frobenius saying that there are just two such
D’s up to R isomorphism, namely R itself and the quaternions H; this theorem
appears in Section 10. For a general field F, it turns out that the set of classes
of finite-dimensional division algebras with center F forms an abelian group.
The group is called the “Brauer group” of F. Its multiplication is defined by the
condition that the class of D₁ times the class of D₂ is the class of a division algebra D₃ such
that D₁ ⊗_F D₂ ≅ M_n(D₃) for some n; the inverse of the class of D is the class
of the opposite algebra D^o, and the identity is the class of F. The study of the
Brauer group is postponed to Chapter III. This group has an interpretation in terms
of cohomology of groups, and it has applications to algebraic number theory.


2. Semisimple Rings and Wedderburn’s Theorem 


We now begin our detailed investigation of associative algebras over a field. In 
this section we shall address the first theorem of Wedderburn’s that is mentioned 
in the previous section. It has two parts, one dealing with semisimple algebras 
and one dealing with finite-dimensional simple algebras. The first part does not 
need the finite dimensionality as a hypothesis, and we begin with that one. 

Let R be a ring with identity. The ring R is left semisimple if the left R 
module R is a semisimple module, i.e., if R is the direct sum of minimal left
ideals.¹ In this case R = ⊕_{i∈S} I_i for some set S and suitable minimal left
ideals I_i. Since R has an identity, we can decompose the identity according to
the direct sum as 1 = 1_{i_1} + ⋯ + 1_{i_n} for some finite subset {i_1, …, i_n} of S,
where 1_{i_k} is the component of 1 in I_{i_k}. Multiplying by r ∈ R on the left, we
see that R ⊆ ⊕_{k=1}^n I_{i_k}. Consequently R has to be a finite sum of minimal left
ideals. A ring R with identity is right semisimple if the right R module R is a 
semisimple module. We shall see later in this section that left semisimple and 
right semisimple are equivalent. 
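
The decomposition of the identity can be seen concretely in a 2-by-2 matrix ring, where the two column left ideals play the role of the minimal left ideals. The Python sketch below is an illustration with hypothetical helper names, using integer matrices as sample elements of M_2(Q): the identity splits as E_11 + E_22, and multiplying by any r on the left splits r into its components in the two column ideals.

```python
def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y):
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def unit(i, j, n):
    """Matrix unit with a 1 in position (i, j) and 0 elsewhere."""
    return [[1 if (r, c) == (i, j) else 0 for c in range(n)] for r in range(n)]

def in_column_ideal(M, j):
    """True if every entry of M outside column j is 0."""
    n = len(M)
    return all(M[r][c] == 0 for r in range(n) for c in range(n) if c != j)

n = 2
identity = [[1, 0], [0, 1]]
E = [unit(j, j, n) for j in range(n)]   # components of 1 in the two column ideals

r = [[2, -1], [3, 5]]                   # an arbitrary sample element
parts = [matmul(r, E[j]) for j in range(n)]

print(add(E[0], E[1]) == identity)                        # 1 = E_11 + E_22
print(all(in_column_ideal(parts[j], j) for j in range(n)))  # r*E_jj lies in column ideal j
print(add(parts[0], parts[1]) == r)                       # r = r*E_11 + r*E_22
```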


EXAMPLES OF SEMISIMPLE RINGS. 


(1) If D is a division ring, then we saw in Example 4 in Section X.1 of Basic
Algebra that the ring R = M_n(D) is left semisimple in the sense of the above
definition. Actually, that example showed more. It showed that R as a left R
module is given by M_n(D) ≅ D^n ⊕ ⋯ ⊕ D^n, where each D^n is a simple left R
module and the jth summand D^n corresponds to the matrices whose only nonzero
entries are in the jth column. The left R module M_n(D) has a composition series
whose terms are the partial sums of the n summands D^n. If M is any simple
left M_n(D) module and if x ≠ 0 is in M, then M = M_n(D)x. If we set
I = {r ∈ M_n(D) | rx = 0}, then I is a left ideal in M_n(D) and M ≅ M_n(D)/I
as a left M_n(D) module. In other words, M is an irreducible quotient module
of the left M_n(D) module M_n(D). By the Jordan–Hölder Theorem (Corollary
10.7 of Basic Algebra), M occurs as a composition factor. Hence M ≅ D^n as
a left M_n(D) module. Hence every simple left M_n(D) module is isomorphic to
D^n. We shall use this style of argument repeatedly but will ordinarily include
less detail.


¹By convention, a “minimal left ideal” always means a “minimal nonzero left ideal.”

(2) If R_1, …, R_n are left semisimple rings, then the direct product R =
∏_{i=1}^n R_i is left semisimple.² In fact, each minimal left ideal of R_i, when included
into R, is a minimal left ideal of R. Hence R is the sum of minimal left ideals
and is left semisimple. By the same kind of argument as for Example 1, every
simple left R module is isomorphic to one of these minimal left ideals.


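Lemma 2.1. Let D be a division ring, let R = M_n(D), and let D^n be the
simple left R module of column vectors. Each member of D acts on D^n by
scalar multiplication on the right side, yielding a member of End_R(D^n). In turn,
End_R(D^n) is a ring, and this identification therefore is an inclusion of the members
of D into the right D module End_R(D^n). The inclusion is in fact an isomorphism
of rings: D^o ≅ End_R(D^n), where D^o is the opposite ring of D.


PROOF. Let φ : D → End_R(D^n) be the function given by φ(d)(v) = vd.
Then φ(dd′)(v) = v(dd′) = (vd)d′ = φ(d′)(vd) = φ(d′)(φ(d)(v)). Since the
order of multiplication in D is reversed by φ, φ is a ring homomorphism of D^o
into End_R(D^n). It is one-one because D^o is a division ring and has no nontrivial
two-sided ideals. To see that it is onto End_R(D^n), let f be in End_R(D^n). Put

\[
f\begin{pmatrix}1\\0\\\vdots\\0\end{pmatrix} = \begin{pmatrix}d\\d_2\\\vdots\\d_n\end{pmatrix}.
\]

Since f is an R module homomorphism,

\[
\begin{aligned}
f\begin{pmatrix}a_1\\a_2\\\vdots\\a_n\end{pmatrix}
&= f\left(\begin{pmatrix}a_1&0&\cdots&0\\a_2&0&\cdots&0\\\vdots&&&\vdots\\a_n&0&\cdots&0\end{pmatrix}
\begin{pmatrix}1\\0\\\vdots\\0\end{pmatrix}\right)
= \begin{pmatrix}a_1&0&\cdots&0\\a_2&0&\cdots&0\\\vdots&&&\vdots\\a_n&0&\cdots&0\end{pmatrix}
f\begin{pmatrix}1\\0\\\vdots\\0\end{pmatrix}\\
&= \begin{pmatrix}a_1&0&\cdots&0\\a_2&0&\cdots&0\\\vdots&&&\vdots\\a_n&0&\cdots&0\end{pmatrix}
\begin{pmatrix}d\\d_2\\\vdots\\d_n\end{pmatrix}
= \begin{pmatrix}a_1d\\a_2d\\\vdots\\a_nd\end{pmatrix}
= \varphi(d)\begin{pmatrix}a_1\\a_2\\\vdots\\a_n\end{pmatrix}.
\end{aligned}
\]

Therefore φ(d) = f, and φ is onto.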
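
The two facts used in the proof, that right multiplication by d commutes with the left action of matrices and that φ reverses the order of products, can be observed for a noncommutative D. The Python sketch below is an illustration under stated assumptions: D is modeled by quaternions with integer coordinates (a, b, c, d) standing for a + bi + cj + dk, n = 2, and all helper names are hypothetical.

```python
def qmul(p, q):
    """Hamilton product of quaternions given as 4-tuples (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
            a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
            a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
            a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2)

def qadd(p, q):
    return tuple(x + y for x, y in zip(p, q))

ZERO = (0, 0, 0, 0)

def mat_vec(A, v):
    """Left action of a matrix A over the quaternions on a column vector v."""
    out = []
    for row in A:
        acc = ZERO
        for a, x in zip(row, v):
            acc = qadd(acc, qmul(a, x))
        out.append(acc)
    return out

def phi(d):
    """phi(d) is right multiplication by d on column vectors."""
    return lambda v: [qmul(x, d) for x in v]

i, j = (0, 1, 0, 0), (0, 0, 1, 0)
A = [[(1, 2, 0, -1), (0, 1, 1, 0)],
     [(3, 0, 0, 2), (1, -1, 4, 0)]]
v = [(2, 0, 1, 0), (0, 3, 0, 1)]

# phi(d) is R linear: it commutes with the left action of the matrix ring.
commutes = phi(i)(mat_vec(A, v)) == mat_vec(A, phi(i)(v))
# phi reverses products: phi(ij) = phi(j) o phi(i).
reverses = phi(qmul(i, j))(v) == phi(j)(phi(i)(v))
# In the other order the two sides differ, since ij = -ji.
differs = phi(qmul(i, j))(v) != phi(i)(phi(j)(v))
print(commutes, reverses, differs)
```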


²Some comment is appropriate about the notation R = ∏_{i=1}^n R_i and the terminology “direct
product.” Indeed, ∏_{i=1}^n R_i is a product in the sense of category theory within the category of rings
or the category of rings with identity. Sometimes one views R alternatively as built from n two-sided
ideals, each corresponding to one of the n coordinates; in this case, one may say that R is the “direct
sum” of these ideals. This direct sum is to be regarded as a direct sum of abelian groups, or perhaps
vector spaces or R modules, but it is not a coproduct within the category of rings with identity.

Theorem 2.2 (Wedderburn). If R is any left semisimple ring, then


R ≅ M_{n_1}(D_1) × ⋯ × M_{n_r}(D_r)


for suitable division rings D_1, …, D_r and positive integers n_1, …, n_r. The number
r is uniquely determined by R, and the ordered pairs (n_1, D_1), …, (n_r, D_r)
are determined up to a permutation of {1, …, r} and an isomorphism of each
D_j. There are exactly r mutually nonisomorphic simple left R modules, namely
(D_1)^{n_1}, …, (D_r)^{n_r}.


PROOF. Write R as the direct sum of minimal left ideals, and then regroup
the summands according to their R isomorphism type as $R \cong \bigoplus_{j=1}^{r} n_j V_j$, where
$n_j V_j$ is the direct sum of $n_j$ submodules R isomorphic to $V_j$ and where $V_i \not\cong V_j$
for $i \neq j$. The isomorphism is one of unital left R modules. Put $D_j^o = \operatorname{End}_R(V_j)$.
This is a division ring by Schur’s Lemma (Proposition 10.4b of Basic Algebra).
Using Proposition 10.14 of Basic Algebra, we obtain an isomorphism of rings

$$R^o \cong \operatorname{End}_R R \cong \operatorname{Hom}_R\Big( \bigoplus_{i=1}^{r} n_i V_i, \; \bigoplus_{j=1}^{r} n_j V_j \Big). \qquad (*)$$

Define $p_i : \bigoplus_{j=1}^{r} n_j V_j \to n_i V_i$ to be the $i$th projection and $q_i : n_i V_i \to
\bigoplus_{j=1}^{r} n_j V_j$ to be the $i$th inclusion. Let us see that the right side of (*) is
isomorphic as a ring to $\prod_i \operatorname{End}_R(n_i V_i)$ via the mapping $f \mapsto (p_1 f q_1, \dots, p_r f q_r)$.
What is to be shown is that $p_i f q_j = 0$ for $i \neq j$. Here $p_i f q_j$ is a member
of $\operatorname{Hom}_R(n_j V_j, n_i V_i)$. The abelian group $\operatorname{Hom}_R(n_j V_j, n_i V_i)$ is the direct sum
of abelian groups isomorphic to $\operatorname{Hom}_R(V_j, V_i)$ by Proposition 10.12, and each
$\operatorname{Hom}_R(V_j, V_i)$ is 0 by Schur’s Lemma (Proposition 10.4a).

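Referring to (*), we therefore obtain ring isomorphisms

$$R^o \cong \prod_{i=1}^{r} \operatorname{Hom}_R(n_i V_i, n_i V_i) = \prod_{i=1}^{r} \operatorname{End}_R(n_i V_i)$$

$$\cong \prod_{i=1}^{r} M_{n_i}(\operatorname{End}_R(V_i)) \qquad \text{by Corollary 10.13}$$

$$= \prod_{i=1}^{r} M_{n_i}(D_i^o) \qquad \text{by definition of } D_i^o.$$

Reversing the order of multiplication in $R^o$ and using the transpose map to
reverse the order of multiplication in each $M_{n_i}(D_i^o)$, we conclude that $R \cong
\prod_{i=1}^{r} M_{n_i}(D_i)$. This proves existence of the decomposition in the theorem.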

We still have to identify the simple left R modules and prove an appropriate
uniqueness statement. As we recalled in Example 1, we have a decomposition
$M_{n_i}(D_i) \cong D_i^{n_i} \oplus \cdots \oplus D_i^{n_i}$ of left $M_{n_i}(D_i)$ modules, and each term $D_i^{n_i}$ is a
simple left $M_{n_i}(D_i)$ module. The decomposition just proved allows us to regard
each term $D_i^{n_i}$ as a simple left R module, $1 \le i \le r$. Each of these modules
is acted upon by a different coordinate of R, and hence we have produced at
least r nonisomorphic simple left R modules. Any simple left R module must
be a quotient of R by a maximal left ideal, as we observed in Example 2, hence
a composition factor as a consequence of the Jordan–Hölder Theorem. Thus
it must be one of the $V_j$’s in the previous part of the proof. There are only
r nonisomorphic such $V_j$’s, and we conclude that the number of simple left R
modules, up to isomorphism, is exactly r.

For uniqueness suppose that $R \cong M_{n'_1}(D'_1) \times \cdots \times M_{n'_s}(D'_s)$ as rings. Let
$V'_j = (D'_j)^{n'_j}$ be the unique simple left $M_{n'_j}(D'_j)$ module up to isomorphism, and
regard $V'_j$ as a simple left R module. Then we have $R \cong \bigoplus_{j=1}^{s} n'_j V'_j$ as left
R modules. By the Jordan–Hölder Theorem we must have r = s and, after a
suitable renumbering, $n_i = n'_i$ and $V_i \cong V'_i$ for $1 \le i \le r$. Thus we have ring
isomorphisms

$$(D'_i)^o \cong \operatorname{End}_{M_{n'_i}(D'_i)}(V'_i) \qquad \text{by Lemma 2.1}$$
$$\cong \operatorname{End}_R(V'_i)$$
$$\cong \operatorname{End}_R(V_i) \qquad \text{since } V_i \cong V'_i$$
$$= D_i^o.$$

Reversing the order of multiplication gives $D'_i \cong D_i$, and the proof is complete.


Corollary 2.3. For a ring R, left semisimple coincides with right semisimple. 


REMARK. Therefore we can henceforth refer to left semisimple rings unam- 
biguously as semisimple. 


PROOF. The theorem gives the form of any left semisimple ring, and each ring 
of this form is certainly right semisimple. 


Wedderburn’s original formulation of Theorem 2.2 was for algebras over a
field F, and he assumed finite dimensionality. The theorem in this case gives

$$R \cong M_{n_1}(D_1) \times \cdots \times M_{n_r}(D_r),$$

and the proof shows that $D_i^o = \operatorname{End}_R(V_i)$, where $V_i$ is a minimal left ideal of
R of the $i$th isomorphism type. The field F lies inside $\operatorname{End}_R(V_i)$, each member
of F yielding a scalar mapping, and hence each $D_i$ is a division algebra over
F. Each $D_i$ is necessarily finite-dimensional over F, since R was assumed to be
finite-dimensional.
We shall make occasional use in this chapter of the fact that if D is a finite-dimensional
division algebra over an algebraically closed field F, then D = F.
To see this equality, suppose that x is a member of D but not of F, i.e., is not an
F multiple of the identity. Then x and F together generate a subfield F(x) of D
that is a nontrivial algebraic extension of F, contradiction. Consequently every
finite-dimensional semisimple algebra R over an algebraically closed field F is
of the form

$$R \cong M_{n_1}(F) \times \cdots \times M_{n_r}(F),$$

for suitable integers $n_1, \dots, n_r$.

As we saw, the finite dimensionality plays no role in decomposing semisim- 
ple rings as the finite product of rings that we shall call “simple.” The place 
where finite dimensionality enters the discussion is in identifying simple rings 
as semisimple, hence in establishing a converse theorem that every finite direct 
product of simple rings, each equal to an ideal of the given ring, is necessarily 
semisimple. We say that a nonzero ring R with identity is simple if its only 
two-sided ideals are 0 and R. 


EXAMPLES OF SIMPLE RINGS. 


(1) If D is a division ring, then $M_n(D)$ is a simple ring. In fact, let J be a
two-sided ideal in $M_n(D)$, fix an ordered pair (i, j) of indices, and let

$$I = \{ x \in D \mid \text{some member } X \text{ of } J \text{ has } X_{ij} = x \}.$$

Multiplying X in this definition on each side by scalar matrices with entries in
D, we see that I is a two-sided ideal in D. If I = 0 for all (i, j), then J = 0.
So assume for some (i, j) that $I \neq 0$. Then I = D for that (i, j), and we may
suppose that some X in J has $X_{ij} = 1$. If $E_{kl}$ denotes the matrix that is 1 in
the $(k, l)$th place and is 0 elsewhere, then $E_{ii} X E_{jj} = E_{ij}$ has to be in J. Hence
$E_{kl} = E_{ki} E_{ij} E_{jl}$ has to be in J, and $J = M_n(D)$.
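The matrix-unit computation in Example 1 rests on the identity $E_{ki} X E_{jl} = X_{ij} E_{kl}$. The sketch below spot-checks it for one integer matrix; the sample matrix, the 0-based indexing, and the helper names `matmul`, `unit`, and `sandwich` are assumptions made for illustration.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[r][t] * B[t][c] for t in range(n)) for c in range(n)]
            for r in range(n)]

def unit(n, k, l):
    """Matrix unit E_kl (0-based): 1 in position (k, l) and 0 elsewhere."""
    return [[1 if (r, c) == (k, l) else 0 for c in range(n)] for r in range(n)]

n = 3
X = [[2, 0, 5], [7, 1, 0], [0, 3, 4]]   # arbitrary sample member of an ideal
i, j = 1, 0                              # a position with X[i][j] nonzero

def sandwich(k, l):
    """Return E_ki X E_jl."""
    return matmul(matmul(unit(n, k, i), X), unit(n, j, l))

# E_ki X E_jl should equal X[i][j] times E_kl for every pair (k, l).
all_match = all(
    sandwich(k, l) == [[X[i][j] * entry for entry in row] for row in unit(n, k, l)]
    for k in range(n) for l in range(n)
)
```

Over a division ring the nonzero entry $X_{ij}$ can be scaled to 1, so a single nonzero member of a two-sided ideal produces every matrix unit and hence the whole ring.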


(2) Let R be the Weyl algebra over C in one variable, namely

$$R = \Big\{ \sum_{n \ge 0} P_n(x) \Big(\frac{d}{dx}\Big)^{n} \;\Big|\; \text{each } P_n \text{ is in } C[x], \text{ and the sum is finite} \Big\}.$$

To give a more abstract construction of R, we can view R as $C[x, \frac{d}{dx}]$ subject to
the relation $\frac{d}{dx}\, x = x\, \frac{d}{dx} + 1$; this is not to be a quotient of a polynomial algebra
in two variables but a quotient of a tensor algebra in two variables. We omit the
details. We shall now prove that the ring R is simple but not semisimple.

To see that R is a simple ring, we use the two identities

(i) $\frac{d}{dx}\, x^n - x^n\, \frac{d}{dx} = n x^{n-1}$, by the product rule,

(ii) $\frac{d^n}{dx^n}\, x = n\, \frac{d^{n-1}}{dx^{n-1}} + x\, \frac{d^n}{dx^n}$, by induction when applied to a polynomial f(x).
Let J be a nonzero two-sided ideal in R, and fix an element $X \neq 0$ in J. Let $x^m$
be the highest power of x appearing in X, and let $\frac{d^n}{dx^n}$ be the highest power of $\frac{d}{dx}$
appearing in terms of X involving $x^m$. Let l and r denote “left multiplication by”
and “right multiplication by,” and apply $\big(l(\frac{d}{dx}) - r(\frac{d}{dx})\big)^m$ to X. Since (i) shows
that

$$\Big(l\Big(\frac{d}{dx}\Big) - r\Big(\frac{d}{dx}\Big)\Big) x^k = k x^{k-1},$$

the result of computing $\big(l(\frac{d}{dx}) - r(\frac{d}{dx})\big)^m X$ is a polynomial in $\frac{d}{dx}$ of degree
exactly n with no x’s. Application of $(r(x) - l(x))^n$ to the result, using (ii),
yields a nonzero constant. We conclude that 1 is in J and therefore that J = R.
Hence R is simple.
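Identities (i) and (ii) are statements about operators on C[x], so they can be spot-checked by applying both sides to a test polynomial. The sketch below does this with integer coefficient lists; the representation `[c0, c1, c2, ...]`, the sample polynomial, and the helper names are assumptions made for illustration, and a check on one polynomial is of course not a proof.

```python
def trim(p):
    """Drop trailing zero coefficients, keeping at least one entry."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def deriv(p):
    """d/dx applied to the polynomial with coefficient list p."""
    return [k * p[k] for k in range(1, len(p))] or [0]

def times_x(p, n=1):
    """Multiplication by x^n."""
    return [0] * n + list(p)

def sub(p, q):
    """Difference p - q of two polynomials."""
    m = max(len(p), len(q))
    p = list(p) + [0] * (m - len(p))
    q = list(q) + [0] * (m - len(q))
    return trim([a - b for a, b in zip(p, q)])

def scale(c, p):
    return trim([c * a for a in p])

def power(op, n, p):
    """Apply the operator op to p a total of n times."""
    for _ in range(n):
        p = op(p)
    return p

f = [3, -1, 0, 2, 5]   # 3 - x + 2x^3 + 5x^4
n = 3

# (i): (d/dx)(x^n f) - x^n (d/dx) f = n x^(n-1) f
lhs_i = sub(deriv(times_x(f, n)), times_x(deriv(f), n))
rhs_i = scale(n, times_x(f, n - 1))

# (ii): (d/dx)^n (x f) - x (d/dx)^n f = n (d/dx)^(n-1) f
lhs_ii = sub(power(deriv, n, times_x(f)), times_x(power(deriv, n, f)))
rhs_ii = scale(n, power(deriv, n - 1, f))
```

Each commutator lowers a degree by one: bracketing with d/dx lowers the degree in x, and bracketing with x lowers the degree in d/dx. This is what allows the proof to reduce a nonzero member of J to a nonzero constant.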

To show that R is not semisimple, first note that C[x] is a natural unital left R
module. We shall show that R has infinite length as a left R module, in the sense
of the length of finite filtrations. In fact,

$$R \supseteq R\Big(\frac{d}{dx}\Big) \supseteq R\Big(\frac{d}{dx}\Big)^{2} \supseteq \cdots \supseteq R\Big(\frac{d}{dx}\Big)^{n} \qquad (*)$$

is a finite filtration of left R submodules of R. If $R(\frac{d}{dx})^{k} = R(\frac{d}{dx})^{k+1}$, then
$(\frac{d}{dx})^{k} = r (\frac{d}{dx})^{k+1}$ for some $r \in R$. Applying these two equal expressions for
a member of R to the member $x^k$ of the left R module C[x], we arrive at a
contradiction, since the left side gives $k! \neq 0$ and the right side gives 0. We
conclude that every inclusion in (*) is strict. Therefore R has
infinite length and is not semisimple.


The extra hypothesis that Wedderburn imposed so that simple rings would 
turn out to be semisimple is finite dimensionality. Wedderburn’s result in this 
direction is Theorem 2.4 below. This hypothesis is quite natural to the extent 
that the subject was originally motivated by the theory of Lie algebras. E. Artin 
found a substitute for the assumption of finite dimensionality that takes the result 
beyond the realm of algebras, and we take up Artin’s idea in the next section. 


Theorem 2.4 (Wedderburn). Let R be a finite-dimensional algebra with
identity over a field F. If R is a simple ring, then R is semisimple and hence
is isomorphic to $M_n(D)$ for some integer $n \ge 1$ and some finite-dimensional
division algebra D over F. The integer n is uniquely determined by R, and D is
unique up to isomorphism.


PROOF. By finite dimensionality, R has a minimal left ideal V. For r in R,
form the set Vr. This is a left ideal, and we claim that it is minimal or is 0. In
fact, the function $v \mapsto vr$ is R linear from V onto Vr. Since V is simple as a
left R module, Vr is simple or 0. The sum $I = \sum_{r \text{ with } Vr \neq 0} Vr$ is a two-sided
ideal in R, and it is not 0 because $V1 \neq 0$. Since R is simple, I = R. Then the
left R module R is exhibited as the sum of simple left R modules and is therefore
semisimple. The isomorphism with $M_n(D)$ and the uniqueness now follow from
Theorem 2.2.
3. Rings with Chain Condition and Artin’s Theorem


Parts of Chapters VIII and IX of Basic Algebra made considerable use of a 
hypothesis that certain commutative rings are “Noetherian,” and we now extend 
this notion to noncommutative rings. A ring R with identity is left Noetherian if 
the left R module R satisfies the ascending chain condition for its left ideals. It is 
left Artinian if the left R module R satisfies the descending chain condition for 
its left ideals. The notions of right Noetherian and right Artinian are defined 
similarly. 

We saw many examples of Noetherian rings in the commutative case in Basic 
Algebra. The ring of integers Z is Noetherian, and so is the ring of polynomials 
R[X] in an indeterminate over a nonzero Noetherian ring R. It follows from the
latter example that the ring $F[X_1, \dots, X_n]$ in finitely many indeterminates over
a field is a Noetherian ring. Other examples arose in connection with extensions 
of Dedekind domains. 

Any finite direct product of fields is Noetherian and Artinian because it has a 
composition series and because its ideals therefore satisfy both chain conditions. 
If p is any prime, the ring $\mathbb{Z}/p^2\mathbb{Z}$ is Noetherian and Artinian for the same reason,
and it is not a direct product of fields. 
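For a concrete instance of the last example, the sketch below lists the ideals of $\mathbb{Z}/p^2\mathbb{Z}$ for p = 3 and finds the nonzero elements whose square is 0. The choice p = 3 and the helper name `ideals` are assumptions made for illustration; the code relies on the standard fact that every ideal of $\mathbb{Z}/n\mathbb{Z}$ is principal.

```python
def ideals(n):
    """All ideals of Z/nZ, each obtained as the multiples of one residue."""
    return {frozenset((g * t) % n for t in range(n)) for g in range(n)}

p = 3
n = p * p

# Sort the ideals by size; for n = p^2 they form a single chain.
chain = sorted(ideals(n), key=len)

# Nonzero residues whose square is 0 in Z/nZ.
nilpotents = [a for a in range(1, n) if pow(a, 2, n) == 0]
```

The ideals form one finite chain, which gives both chain conditions. The nonzero nilpotent residues show that the ring cannot be a direct product of fields, since a product of fields has no nonzero nilpotent elements.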

In the noncommutative setting, any semisimple ring is necessarily left Noe- 
therian and left Artinian because it has a composition series for its left ideals and 
the left ideals therefore satisfy both chain conditions. 


Proposition 2.5. Let R be a ring with identity, and let M be a finitely generated 
unital left R module. If R is left Noetherian, then M satisfies the ascending 
chain condition for its R submodules; if R is left Artinian, then M satisfies the 
descending chain condition for its R submodules. 


PROOF. We prove the first conclusion by induction on the number of generators,
and the proof of the second conclusion is completely similar. The result is trivial
if M has 0 generators. If M = Rx, then M is a quotient of the left R module
R and satisfies the ascending chain condition for its R submodules, according to
Proposition 10.10 of Basic Algebra. For the inductive step with $n \ge 2$ generators,
write $M = Rx_1 + \cdots + Rx_n$ and $N = Rx_1 + \cdots + Rx_{n-1}$. Then N satisfies
the ascending chain condition for its R submodules by the inductive hypothesis,
and M/N is isomorphic to $Rx_n/(N \cap Rx_n)$, which satisfies the ascending chain
condition for its R submodules by the inductive hypothesis. Therefore M satisfies
the ascending chain condition for its R submodules by application of the converse
direction of Proposition 10.10.


Artin’s theorem (Theorem 2.6 below) will make use of the hypothesis “left
Artinian” in identifying those simple rings that are semisimple. The hypothesis
left Artinian may therefore be regarded as a useful generalization of finite
dimensionality. Before we come to that theorem, we give a construction that produces
large numbers of nontrivial examples of such rings.


EXAMPLE (triangular rings). Let R and S be nonzero rings with identity, and
let M be an (R, S) bimodule.³ Define a set A and operations of addition and
multiplication symbolically by

$$A = \begin{pmatrix} R & M \\ 0 & S \end{pmatrix} = \left\{ \begin{pmatrix} r & m \\ 0 & s \end{pmatrix} \;\middle|\; r \in R,\ m \in M,\ s \in S \right\}$$

with

$$\begin{pmatrix} r & m \\ 0 & s \end{pmatrix} \begin{pmatrix} r' & m' \\ 0 & s' \end{pmatrix} = \begin{pmatrix} rr' & rm' + ms' \\ 0 & ss' \end{pmatrix}.$$

Then A is a ring with identity, the bimodule property entering the proof of
associativity of multiplication in A. We can identify R, M, and S with the
additive subgroups of A given by $\begin{pmatrix} R & 0 \\ 0 & 0 \end{pmatrix}$, $\begin{pmatrix} 0 & M \\ 0 & 0 \end{pmatrix}$, and $\begin{pmatrix} 0 & 0 \\ 0 & S \end{pmatrix}$. Problems 8-11 at
the end of the chapter ask one to check the following facts:
(i) The left ideals in A are of the form $I_1 \oplus I_2$, where $I_2$ is a left ideal in S
and $I_1$ is a left R submodule of $R \oplus M$ containing $M I_2$.
(ii) The right ideals in A are of the form $J_1 \oplus J_2$, where $J_1$ is a right ideal in
R and $J_2$ is a right S submodule of $M \oplus S$ containing $J_1 M$.
(iii) The ring A is left Noetherian if and only if R and S are left Noetherian
and M satisfies the ascending chain condition for its left R submodules.
The ring A is right Noetherian if and only if R and S are right Noetherian
and M satisfies the ascending chain condition for its right S submodules.
(iv) The previous item remains valid if “Noetherian” is replaced by
“Artinian” and “ascending” is replaced by “descending.”
(v) If $A = \begin{pmatrix} R & R \\ 0 & S \end{pmatrix}$ is a ring such as $\begin{pmatrix} \mathbb{Q} & \mathbb{Q} \\ 0 & \mathbb{Z} \end{pmatrix}$ in which S is a (commutative)
Noetherian integral domain with field of fractions R and if $S \neq R$, then
A is left Noetherian and not right Noetherian, and A is neither left nor
right Artinian.
(vi) If $A = \begin{pmatrix} R & R \\ 0 & S \end{pmatrix}$ is a ring such as $\begin{pmatrix} \mathbb{Q}(x) & \mathbb{Q}(x) \\ 0 & \mathbb{Q} \end{pmatrix}$ in which R and S are fields with
$S \subseteq R$ and $\dim_S R$ infinite, then A is left Noetherian and left Artinian,
and A is neither right Noetherian nor right Artinian.
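The multiplication rule for a triangular ring can be tried out in the simplest case. The sketch below assumes R = M = S = Z, with Z regarded as a (Z, Z) bimodule, and stores the matrix with entries r, m, s as the triple `(r, m, s)`; these choices and the name `mul` are made for illustration only.

```python
from itertools import product

def mul(a, b):
    """(r m; 0 s)(r' m'; 0 s') = (rr'  rm' + ms'; 0  ss') on triples (r, m, s)."""
    r, m, s = a
    r2, m2, s2 = b
    return (r * r2, r * m2 + m * s2, s * s2)

one = (1, 0, 1)                                   # the identity of A
samples = list(product(range(-1, 2), repeat=3))   # 27 sample elements

associative = all(
    mul(mul(a, b), c) == mul(a, mul(b, c))
    for a in samples for b in samples for c in samples
)
unital = all(mul(one, a) == a == mul(a, one) for a in samples)

e11 = (1, 0, 0)   # the identity of R, embedded in A
m12 = (0, 1, 0)   # an element of M, embedded in A
```

Even with commutative R and S the ring A is noncommutative: `mul(e11, m12)` returns `m12`, while `mul(m12, e11)` is zero. This asymmetry between the two sides is the source of the one-sided behavior in items (v) and (vi).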


From these examples we see, among other things, that “left” and “right” are
somewhat independent for both the Noetherian and the Artinian conditions. We
already know from the commutative case that Noetherian does not imply Artinian,
Z being a counterexample. We shall see in Theorem 2.15 later that left Artinian
implies left Noetherian and that right Artinian implies right Noetherian.


3This means that M is an abelian group with the structure of a unital left R module and the
structure of a unital right S module in such a way that (rm)s = r(ms) for all $r \in R$, $m \in M$, and
$s \in S$.


Theorem 2.6 (E. Artin). If R is a simple ring, then the following conditions 
are equivalent: 


(a) R is left Artinian, 

(b) R is semisimple, 

(c) R has a minimal left ideal, 

(d) $R \cong M_n(D)$ for some integer $n \ge 1$ and some division ring D.


In particular, a left Artinian simple ring is right Artinian. 


REMARK. Theorem 2.4 is a special case of the assertion that (a) implies
(d). In fact, if R is a finite-dimensional algebra over a field F, then the finite
dimensionality forces R to be left Artinian.


PROOF. It is evident from Wedderburn’s Theorem (Theorem 2.2) that (b) and 
(d) are equivalent. For the rest we prove that (a) implies (c), that (c) implies (b), 
and that (b) implies (a). 

Suppose that (a) holds. Applying the minimum condition for left ideals in R, 
we obtain a minimal left ideal. Thus (c) holds. 

Suppose that (c) holds. Let V be a minimal left ideal. Then the sum $I =
\sum_{r \in R} Vr$ is a two-sided ideal in R, and it is nonzero because the term for r = 1
is nonzero. Since R is simple, I = R. Then the left R module R is spanned by
the simple left R modules Vr, and R is semisimple. Thus (b) holds.

Suppose that (b) holds. Since R is semisimple, the left R module R has a 
composition series. Then the left ideals in R satisfy both chain conditions, and it 
follows that R is left Artinian. Thus (a) holds. 


4. Wedderburn—Artin Radical 


In this section we introduce one notion of “radical” for certain rings with identity, 
and we show how it is related to semisimplicity. This notion, the “Wedderburn— 
Artin radical,” is defined under the hypothesis that the ring is left Artinian. It is
not the only notion of radical studied by ring theorists, however. There is a useful 
generalization, known as the “Jacobson radical,” that is defined for arbitrary rings 
with identity. We shall not define and use the Jacobson radical in this text. 

Fix a ring R with identity. A nilpotent element in R is an element a with
$a^n = 0$ for some integer $n \ge 1$. A nil left ideal is a left ideal in which every
element is nilpotent; nil right ideals and nil two-sided ideals are defined similarly.
A nilpotent left ideal is a left ideal I such that $I^n = 0$ for some integer $n \ge 1$,
i.e., for which $a_1 \cdots a_n = 0$ for all n-fold products of elements from I; nilpotent
right ideals and nilpotent two-sided ideals are defined similarly.


Lemma 2.7. If $I_1$ and $I_2$ are nilpotent left ideals in a ring R with identity, then
$I_1 + I_2$ is nilpotent.


PROOF. Let $I_1^r = 0$ and $I_2^s = 0$. Expand $(I_1 + I_2)^k$ as $\sum I_{i_1} \cdots I_{i_k}$ with each
$i_j$ equal to 1 or 2. Take k = r + s. In any term of the sum, there are $\ge r$ indices 1
or $\ge s$ indices 2. In the first case let there be t indices 2 at the right end. Since
$I_2 I_1 \subseteq I_1$, we can absorb all other indices 2, and the term of the sum is contained
in $I_1^r I_2^t = 0$. Similarly in the second case if there are $t'$ indices 1 at the right end,
then the term is contained in $I_2^s I_1^{t'} = 0$.


Lemma 2.8. If I is a nilpotent left ideal in a ring R with identity, then I is
contained in a nilpotent two-sided ideal J.

PROOF. Put $J = \sum_{r \in R} Ir$. This is a two-sided ideal. For any integer k > 0,
$J^k = \big(\sum_{r \in R} Ir\big)^k \subseteq \sum_{r_1, \dots, r_k} I r_1 I r_2 \cdots I r_k \subseteq \sum_{r_k} I^k r_k$. If $I^k = 0$, then
$J^k = 0$.


Lemma 2.9. If R is a ring with identity, then the sum of all nilpotent left ideals
in R is a nil two-sided ideal.


PROOF. Let K be the sum of all nilpotent left ideals in R, and let a be a member
of K. Write $a = a_1 + \cdots + a_n$ with $a_i \in I_i$ for a nilpotent left ideal $I_i$. Lemma
2.7 shows that $I = \sum_{i=1}^{n} I_i$ is a nilpotent left ideal. Since a is in I, a is a nilpotent
element.

The set K is certainly a left ideal, and we need to see that aR is in K in order to
see that K is a two-sided ideal. Lemma 2.8 shows that $I \subseteq J$ for some nilpotent
two-sided ideal J. Then $J \subseteq K$ because J is one of the nilpotent left ideals
whose sum is K. Since a is in I and therefore in J and since J is a two-sided
ideal, aR is contained in J. Therefore aR is contained in K, and K is a two-sided
ideal.


Theorem 2.10. If R is a left Artinian ring, then any nil left ideal in R is 
nilpotent. 


REMARK. Readers familiar with a little structure theory for finite-dimensional 
Lie algebras will recognize this theorem as an analog for associative algebras of 
Engel’s Theorem. 


PROOF. Let I be a nil left ideal of R, and form the filtration

$$I \supseteq I^2 \supseteq I^3 \supseteq \cdots.$$

Since R is left Artinian, this filtration is constant from some point on, and we
have $I^k = I^{k+1} = I^{k+2} = \cdots$ for some $k \ge 1$. Put $J = I^k$. We shall show that
J = 0, and then we shall have proved that I is a nilpotent ideal.

Suppose that $J \neq 0$. Since $J^2 = I^{2k} = I^k = J$, we have $J^2 = J$. Thus the
left ideal J has the property that $JJ \neq 0$. Since R is left Artinian, the set of left
ideals $K \subseteq J$ with $JK \neq 0$ has a minimal element $K_0$. Choose $a \in K_0$ with
$Ja \neq 0$. Since $Ja \subseteq JK_0 \subseteq K_0$ and $J(Ja) = J^2 a = Ja \neq 0$, the minimality
of $K_0$ implies that $Ja = K_0$. Thus there exists $x \in J$ with xa = a. Applying
powers of x, we obtain $x^n a = a$ for every integer $n \ge 1$. But x is a nilpotent
element, being in I, and thus we have a contradiction.


Corollary 2.11. If R is a left Artinian ring, then there exists a unique largest
nilpotent two-sided ideal I in R. This ideal is the sum of all nilpotent left ideals
and also is the sum of all nilpotent right ideals.


REMARKS. The two-sided ideal I of the corollary is called the Wedderburn–Artin
radical of R and will be denoted by rad R. This exists under the hypothesis
that R is left Artinian.


PROOF. By Lemma 2.9 and Theorem 2.10 the sum of all nilpotent left ideals in
R is a two-sided nilpotent ideal I. Lemma 2.8 shows that any nilpotent right ideal
is contained in a nilpotent two-sided ideal J. Since J is in particular a nilpotent
left ideal, the definition of I forces $J \subseteq I$. Hence the sum of all nilpotent right
ideals is contained in I. But I itself is a nilpotent right ideal and hence equals
the sum of all the nilpotent right ideals.


Lemma 2.12 (Brauer’s Lemma). If R is any ring with identity and if V is a
minimal left ideal in R, then either $V^2 = 0$ or V = Re for some element e of V
with $e^2 = e$.


REMARK. An element e with the property that $e^2 = e$ is said to be idempotent.


PROOF. Being a minimal left ideal, V is a simple left R module. Schur’s
Lemma (Proposition 10.4b of Basic Algebra) shows that $\operatorname{End}_R V$ is a division
ring. If a is in V, then the map $v \mapsto va$ of V into itself lies in $\operatorname{End}_R V$ and hence
is the 0 map or is one-one onto. If it is the 0 map for all $a \in V$, then $V^2 = 0$.
Otherwise suppose that a is an element for which $v \mapsto va$ is one-one onto. Then
there exists $e \in V$ with ea = a. Multiplying on the left by e gives $e^2 a = ea$ and
therefore $(e^2 - e)a = 0$. Since the map $v \mapsto va$ is assumed to be one-one onto,
we must have $e^2 - e = 0$ and $e^2 = e$. Finally $e \neq 0$ because $ea = a \neq 0$, and thus Re
is a nonzero left ideal contained in V; the minimality of V gives V = Re.


Theorem 2.13. If R is a left Artinian ring and if the Wedderburn–Artin radical
of R is 0, then R is a semisimple ring.




REMARKS. Conversely semisimple rings are left Artinian and have radical 0.
In fact, we already know that semisimple rings have a composition series for
their left ideals and hence are left Artinian. To see that the radical is 0, apply
Theorem 2.2 and write the ring as $R \cong M_{n_1}(D_1) \times \cdots \times M_{n_r}(D_r)$. The two-sided
ideals of R are the various subproducts, with 0 in the missing coordinates. Such a
subproduct cannot be nilpotent as an ideal unless it is 0, since the identity element
in any factor is not a nilpotent element in R.


PROOF. Let us see that any minimal left ideal I of R is a direct summand as a
left R submodule. Since rad R = 0, I is not nilpotent. Thus $I^2 \neq 0$, and Lemma
2.12 shows that I contains an idempotent e. This element satisfies I = Re. Put
$I' = \{ r \in R \mid re = 0 \}$. Then $I'$ is a left ideal in R. Since $I' \cap I \subseteq I$ and e is
not in $I'$, the minimality of I forces $I' \cap I = 0$. Writing $r = re + (r - re)$ with
$re \in I$ and $r - re \in I'$, we see that $R = I + I'$. Therefore $R = I \oplus I'$.

Now put $I_1 = I$ and $e_1 = e$. If $I'$ is not 0, choose a minimal left ideal $I_2 \subseteq I'$ by the
minimum condition for left ideals in R. Arguing as in the previous paragraph, we
have $I_2 = Re_2$ for some element $e_2$ with $e_2^2 = e_2$. The argument in the previous
paragraph shows that $R = I_2 \oplus I_2'$, where $I_2' = \{ r \in R \mid re_2 = 0 \}$. Define $I'' =
\{ r \in R \mid re_1 = re_2 = 0 \} = I' \cap I_2'$. Since $I_2$ is contained in $I'$, we can intersect
$R = I_2 \oplus I_2'$ with $I'$ and obtain $I' = I_2 \oplus I''$. Then $R = I_1 \oplus I' = I_1 \oplus I_2 \oplus I''$.
Continuing in this way, we obtain $R = I_1 \oplus I_2 \oplus I_3 \oplus I'''$, etc. As this construction
continues, we have $I' \supsetneq I'' \supsetneq I''' \supsetneq \cdots$. Since R is left Artinian, this sequence
must terminate, evidently in 0. Then R is exhibited as the sum of simple left R
modules and is semisimple.


Corollary 2.14. If R is a left Artinian ring, then R/rad R is a semisimple ring.


PROOF. Let I = rad R, and let $\varphi : R \to R/I$ be the quotient homomorphism.
Arguing by contradiction, let $\bar{J}$ be a nonzero nilpotent left ideal in R/I, and let
$J = \varphi^{-1}(\bar{J}) \subseteq R$. Since $\bar{J}$ is nilpotent, $J^k \subseteq I$ for some integer $k \ge 1$. But
I, being the radical, is nilpotent, say with $I^l = 0$, and hence $J^{kl} = (J^k)^l \subseteq I^l = 0$.
Therefore J is a nilpotent left ideal in R strictly containing I, in contradiction to
the maximality of I. We conclude that no such $\bar{J}$ exists. Then rad(R/rad R) = 0.
Since R/rad R is left Artinian as a quotient of a left Artinian ring, Theorem 2.13
shows that R/rad R is a semisimple ring.


We shall use this corollary to prove that left Artinian rings are left Noetherian. 
We state the theorem, state and prove a lemma, and then prove the theorem. 


Theorem 2.15 (Hopkins). If R is a left Artinian ring, then R is left Noetherian. 


Lemma 2.16. If R is a semisimple ring, then every unital left R module M
is semisimple. Consequently any unital left R module satisfying the descending
chain condition has a composition series and therefore satisfies the ascending
chain condition.


PROOF. For each $m \in M$, let $R_m$ be a copy of the left R module R, and
define $\widetilde{M} = \bigoplus_{m \in M} R_m$ as a left R module. Since each $R_m$ is semisimple, $\widetilde{M}$ is
semisimple. Define a function $\varphi : \widetilde{M} \to M$ as follows: if $r_{m_1} + \cdots + r_{m_k}$ is given
with $r_{m_j}$ in $R_{m_j}$ for each j, let $\varphi(r_{m_1} + \cdots + r_{m_k}) = \sum_{j=1}^{k} r_{m_j} m_j$. Then $\varphi$ is an
R module map with the property that $\varphi(1_m) = m$, and consequently $\varphi$ carries $\widetilde{M}$
onto M. As the image of a semisimple R module under an R module map, M is
semisimple.

Now suppose that M is a unital left R module satisfying the descending chain
condition. We have just seen that M is semisimple, and thus we can write
$M = \bigoplus_{i \in S} M_i$ as a direct sum over a set S of simple left R modules $M_i$. Let us
see that S is a finite set. If S were not a finite set, then we could choose an infinite
sequence $i_1, i_2, \dots$ of distinct members of S, and we would obtain

$$M \supsetneq \bigoplus_{i \neq i_1} M_i \supsetneq \bigoplus_{i \neq i_1, i_2} M_i \supsetneq \cdots,$$

in contradiction to the fact that the R submodules of M satisfy the descending
chain condition.


PROOF OF THEOREM 2.15. Let I = rad R. Since I is nilpotent, $I^n = 0$ for
some n. Each $I^k$ for $k \ge 0$ is a left R submodule of R. Since R is left Artinian,
its left R submodules satisfy the descending chain condition, and the same thing
is true of the R submodules of each $I^k$. Consequently the R submodules of each
$I^k/I^{k+1}$ satisfy the descending chain condition.

In the action of R on $I^k/I^{k+1}$ on the left, I acts as 0. Hence $I^k/I^{k+1}$ becomes
a left R/I module, and the R/I submodules of this left R/I module must satisfy
the descending chain condition. Corollary 2.14 shows that R/I = R/rad R is
a semisimple ring. Since the R/I submodules of $I^k/I^{k+1}$ satisfy the descending
chain condition, Lemma 2.16 shows that these R/I submodules satisfy the
ascending chain condition. Therefore the R submodules of each left R module
$I^k/I^{k+1}$ satisfy the ascending chain condition.

We shall show inductively for $k \ge 0$ that the R submodules of $R/I^{k+1}$ satisfy
the ascending chain condition. Since $I^n = 0$, this conclusion will establish that
R is left Noetherian, as required. The case k = 0 was shown in the previous
paragraph. Assume inductively that the R submodules of $R/I^k$ satisfy the
ascending chain condition. Since $R/I^k \cong (R/I^{k+1})/(I^k/I^{k+1})$ and since the
R submodules of $R/I^k$ and of $I^k/I^{k+1}$ satisfy the ascending chain condition, the
same is true for $R/I^{k+1}$. This completes the proof.
5. Wedderburn’s Main Theorem


Wedderburn’s Main Theorem is an analog for finite-dimensional associative 
algebras over a field of characteristic 0 of the Levi decomposition of a finite- 
dimensional Lie algebra over a field of characteristic 0. Each of these results says 
that the given algebra is a “semidirect product” of the radical and a semisimple 
subalgebra isomorphic to the quotient of the given algebra by the radical. In other 
words, the whole algebra, as a vector space, is the direct sum of the radical and a 
vector subspace that is closed under multiplication. 

An example of this phenomenon occurs with a block upper-triangular subalgebra
A of $M_n(D)$ whenever D is a finite-dimensional division algebra over the
given field. Let the diagonal blocks be of sizes $n_1, \dots, n_r$ with $n_1 + \cdots + n_r = n$.
The radical rad A is the nilpotent ideal of all matrices whose only nonzero entries
are above and to the right of the diagonal blocks, and the semisimple subalgebra
consists of all matrices whose only nonzero entries lie within the diagonal blocks.
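The block upper-triangular example can be made concrete in the smallest nontrivial case. The sketch below assumes n = 3 with diagonal blocks of sizes 2 and 1 and integer entries standing in for members of D; the set `RAD` of radical positions and the helpers `mm` and `split` are names chosen for illustration.

```python
def mm(A, B):
    """Product of two 3-by-3 matrices given as lists of rows."""
    return [[sum(A[i][t] * B[t][j] for t in range(3)) for j in range(3)]
            for i in range(3)]

# Positions above and to the right of the diagonal blocks (sizes 2 and 1).
RAD = {(0, 2), (1, 2)}

def split(a):
    """Return (s, n): the diagonal-block part and the radical part of a."""
    n = [[a[i][j] if (i, j) in RAD else 0 for j in range(3)] for i in range(3)]
    s = [[a[i][j] - n[i][j] for j in range(3)] for i in range(3)]
    return s, n

a = [[1, 2, 3], [4, 5, 6], [0, 0, 7]]   # sample members of A
b = [[2, 0, 1], [1, 1, 4], [0, 0, 3]]
sa, na = split(a)
sb, nb = split(b)
zero = [[0] * 3 for _ in range(3)]

rad_squared = mm(na, nb)            # product of two radical elements
s_part_of_a_nb = split(mm(a, nb))[0]   # diagonal-block part of a * nb
s_part_of_nb_a = split(mm(nb, a))[0]   # diagonal-block part of nb * a
n_part_of_sa_sb = split(mm(sa, sb))[1]  # radical part of sa * sb
```

In this sample all four computed matrices are zero. This matches the text: the radical part is a nilpotent two-sided ideal (here its square is already 0), and the diagonal-block part is closed under multiplication, so that a = s + n exhibits the vector-space decomposition.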


Theorem 2.17 (Wedderburn’s Main Theorem). Let A be a finite-dimensional
associative algebra with identity over a field F of characteristic 0, and let rad A be
the Wedderburn–Artin radical. Then there exists a subalgebra S of A isomorphic
as an F algebra to A/rad A such that $A = S \oplus \operatorname{rad} A$ as vector spaces.


REMARKS. The finite dimensionality implies that A is left Artinian, and
Corollary 2.14 shows that A/rad A is a semisimple algebra. The decomposition
$A = S \oplus \operatorname{rad} A$ is different in nature from the one in Theorem 2.2, which involves
complementary ideals. When there are complementary ideals, the identity of A
decomposes as the sum of the identities for each summand. Here the identity of
A is the identity of S and has 0 component in rad A. To see this, write 1 = a + b
with $a \in S$ and $b \in \operatorname{rad} A$. Multiplying 1 = a + b on the left and right by $s \in S$,
we see that as = s = sa and that bs = sb = 0. Hence $a = 1_S$ is the identity of
S. Then $b^2 = (1 - 1_S)^2 = 1 - 2 \cdot 1_S + 1_S^2 = 1 - 2 \cdot 1_S + 1_S = 1 - 1_S = b$, and
$b^n = b$ for all $n \ge 1$. Since rad A is nilpotent, $b^n = 0$ for some n. Thus b = 0,
and $1 = 1_S$ as asserted.


Theorem 2.17 is a deep result, and the proof will occupy all of the present
section and the next. The key special case to understand occurs when $A/\operatorname{rad} A \cong
M_{n_1}(F) \times \cdots \times M_{n_r}(F)$. We shall handle this case by means of Theorem 2.18
below, whose proof will be the main goal of the present section. Corollary 2.27 (of
Theorem 2.18) near the end of this section will show that Theorem 2.18 implies
this special case of Theorem 2.17 for r = 1, and Corollary 2.28 will deduce this
special case of Theorem 2.17 for general r from Corollary 2.27.
Theorem 2.18. Let A be a left Artinian ring with Wedderburn–Artin radical
rad A, and suppose that A/rad A is simple, i.e., is of the form $A/\operatorname{rad} A \cong M_n(D)$
for some division ring D. Then A is isomorphic as a ring to $M_n(R)$ for some left
Artinian ring R such that $R/\operatorname{rad} R \cong D$.


The idea behind the proof of Theorem 2.18 is to give an abstract characterization
of a ring of matrices in terms of the elements $E_{ij}$ that are 1 in the $(i, j)$th
place and are 0 elsewhere. In turn, these elements arise from the diagonal such
elements $E_{ii}$, which are idempotents, i.e., have $E_{ii}^2 = E_{ii}$. The critical issue in
the proof of Theorem 2.18 is to show that each idempotent of A/rad A, which is
assumed to be a full matrix ring $M_n(D)$, has an idempotent in its preimage in A.
The lifted idempotents then point to $M_n(R)$ for a certain R.

Thus we begin with some discussion of idempotents. We shall intersperse
facts about general rings with facts about left Artinian rings as we go along. For
the moment let R be any ring with identity, and let e be an idempotent. Then
1 − e is an idempotent, and we have the three Peirce⁴ decompositions

$$R = Re \oplus R(1 - e),$$
$$R = eR \oplus (1 - e)R,$$
$$R = eRe \oplus eR(1 - e) \oplus (1 - e)Re \oplus (1 - e)R(1 - e).$$

All the direct sums may be regarded as direct sums of abelian groups. The two
members of the right side in the first case are left ideals, and the two members of
the right side in the second case are right ideals. If $r \in R$ is given, then the first
decomposition is as r = re + r(1 − e); the decomposition is direct because if
$r_1 e = r_2(1 - e)$, then right multiplication by e gives $r_1 e = 0$ since $e^2 = e$. The
second decomposition is proved similarly, and the third decomposition follows
by combining the first two. In the third decomposition, eRe is a ring with e as
identity, and (1 − e)R(1 − e) is a ring with 1 − e as identity.
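The third Peirce decomposition can be checked numerically for an idempotent that is not diagonal. The sketch below assumes R is the ring of 2-by-2 integer matrices; the idempotent `e`, the sample element `r`, and the helper names `mm` and `add` are choices made for illustration.

```python
def mm(A, B):
    """Product of two 2-by-2 matrices given as lists of rows."""
    return [[sum(A[i][t] * B[t][j] for t in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B):
    return [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]

zero = [[0, 0], [0, 0]]
e = [[1, 1], [0, 0]]      # an idempotent that is not a diagonal matrix
f = [[0, -1], [0, 1]]     # f = 1 - e
r = [[2, 3], [5, 7]]      # arbitrary sample element of R

# The four Peirce components ere, erf, fre, frf of r.
parts = [mm(mm(a, r), b) for a in (e, f) for b in (e, f)]

total = zero
for part in parts:
    total = add(total, part)
```

The sum `total` recovers `r` because (e + f) r (e + f) = r, and the components lie in independent summands because ef = fe = 0.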


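EXAMPLE. Let $R = M_n(F)$, and let

$$e = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}, \quad \text{so that} \quad 1 - e = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}.$$

In block form we then have

$$eRe = \begin{pmatrix} * & 0 \\ 0 & 0 \end{pmatrix}, \qquad eR(1 - e) = \begin{pmatrix} 0 & * \\ 0 & 0 \end{pmatrix},$$

$$(1 - e)Re = \begin{pmatrix} 0 & 0 \\ * & 0 \end{pmatrix}, \qquad (1 - e)R(1 - e) = \begin{pmatrix} 0 & 0 \\ 0 & * \end{pmatrix}.$$


4Pronounced “purse.” Charles Sanders Peirce (1839-1914).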


Proposition 2.19. In a ring R with identity, let e be an element of R with
$e^2 = e$.

(a) If I is a left ideal in eRe, then eRI = I. Hence $I \mapsto RI$ is a one-one
inclusion-preserving map of the left ideals of eRe to those of R.

(b) If I is a two-sided ideal of eRe, then e(RIR)e = I. Hence $I \mapsto RIR$
is a one-one inclusion-preserving map of the two-sided ideals of eRe to those of
R. This map respects multiplication of ideals.

(c) If J is a two-sided ideal of R, then eJe is a two-sided ideal of eRe, and
$eRe \cap J = eJe$.

PROOF. For (a), we have $eRI = eR(eI) = (eRe)I = I$, the first equality
holding because $e$ is the identity in $eRe$ and the third equality holding because
$eRe$ contains its identity $e$. The rest of (a) then follows.

For (b), $I$ satisfies $I = eIe$, since $ej = je = j$ for every $j \in eRe$, and
therefore $eRIRe = eReIeRe = (eRe)I(eRe) = I$, the last equality holding
because $eRe$ contains its identity $e$. To see that $I \mapsto RIR$ respects
multiplication, we compute that $(RIR)(RI'R) = RIRI'R = R(Ie)R(eI')R =
RI(eRe)I'R = RII'R$.

For (c), $eRe \cap J \supseteq eJe$ certainly. In the reverse direction, let $j$ be in $eRe \cap J$.
Then $j = ere$ for some $r \in R$, and hence $eje = e^2re^2 = ere = j$ shows that $j$
is in $eJe$.


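Corollary 2.20. In a left Artinian ring $R$, let $e$ be an element with $e^2 = e$.
Then the ring $eRe$ is left Artinian, and
\[
\operatorname{rad}(eRe) = eRe \cap \operatorname{rad} R = e(\operatorname{rad} R)e.
\]
If $\bar{R}$ denotes the quotient ring $R/\operatorname{rad} R$ and $\bar{e}$ denotes the element $e + \operatorname{rad} R$ of the
quotient, then the quotient map carries $eRe$ onto $\bar{e}\bar{R}\bar{e}$ and has kernel $\operatorname{rad}(eRe)$.
Consequently
\[
eRe/\operatorname{rad}(eRe) \cong \bar{e}\bar{R}\bar{e}.
\]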


PROOF. The ring $eRe$ is left Artinian as an immediate consequence of Proposition
2.19a. For the first display we may assume that $R$ and $eRe$ are both left
Artinian. Then $eRe \cap \operatorname{rad} R$ is a two-sided ideal of $eRe$, and
$(eRe \cap \operatorname{rad} R)^n \subseteq (\operatorname{rad} R)^n$ for every $n$. Since $(\operatorname{rad} R)^N = 0$ for some $N$,
$eRe \cap \operatorname{rad} R$ is nilpotent, and $eRe \cap \operatorname{rad} R \subseteq \operatorname{rad}(eRe)$. Since the reverse
inclusion is evident, we obtain $\operatorname{rad}(eRe) = eRe \cap \operatorname{rad} R$. The equality
$eRe \cap \operatorname{rad} R = e(\operatorname{rad} R)e$ is the special case of Proposition 2.19c in which
$J = \operatorname{rad} R$. This proves the equalities in the first display.

For the isomorphism in the second display, the quotient mapping carries $ere$
to $ere + \operatorname{rad} R = (e + \operatorname{rad} R)(r + \operatorname{rad} R)(e + \operatorname{rad} R) = \bar{e}(r + \operatorname{rad} R)\bar{e}$. Thus
the quotient map $R \to \bar{R}$ carries $eRe$ onto $\bar{e}\bar{R}\bar{e}$. The kernel is $eRe \cap \operatorname{rad} R$,
which we have just proved is $\operatorname{rad}(eRe)$. Therefore the quotient map exhibits an
isomorphism of rings $eRe/\operatorname{rad}(eRe) \cong \bar{e}\bar{R}\bar{e}$.


Proposition 2.21. In a ring $R$ with identity, let $e_1$ and $e_2$ be idempotents. Then
the unital left $R$ modules $Re_1$ and $Re_2$ are isomorphic as left $R$ modules if and
only if there exist elements $e_{12}$ and $e_{21}$ in $R$ such that
\[
e_1 e_{12} e_2 = e_{12}, \qquad e_2 e_{21} e_1 = e_{21},
\]
\[
e_{12} e_{21} = e_1, \qquad e_{21} e_{12} = e_2.
\]

REMARK. In this case we shall say that $e_1$ and $e_2$ are isomorphic idempotents,
and we shall write $e_1 \cong e_2$.


PROOF. Let $\varphi : Re_1 \to Re_2$ be an $R$ isomorphism. Define $e_{12} = \varphi(e_1)$
and $e_{21} = \varphi^{-1}(e_2)$. Every element $s$ of $Re_2$ has the property that $se_2 = s$
because $e_2^2 = e_2$; since $e_{12}$ lies in $Re_2$, $e_{12}e_2 = e_{12}$. Meanwhile, $e_{12} = \varphi(e_1) =
\varphi(e_1^2) = e_1\varphi(e_1) = e_1e_{12}$. Putting these two facts together gives $e_{12} = e_{12}e_2 =
e_1e_{12}e_2$. This proves the first equality in the display, and the equality $e_{21} =
e_2e_{21}e_1$ is proved similarly. Also, $e_1 = \varphi^{-1}(\varphi(e_1)) = \varphi^{-1}(e_{12}) = \varphi^{-1}(e_{12}e_2) =
e_{12}\varphi^{-1}(e_2) = e_{12}e_{21}$, and similarly $e_{21}e_{12} = e_2$. This completes the proof that
an $R$ isomorphism $Re_1 \cong Re_2$ leads to elements $e_{12}$ and $e_{21}$ such that the four
displayed identities hold.

For the converse, suppose that $e_{12}$ and $e_{21}$ exist and satisfy the four displayed
identities. Define $\varphi : Re_1 \to R$ by $\varphi(re_1) = re_{12}$. To see that this map is well
defined, suppose that $re_1 = 0$; then $re_{12} = r(e_1e_{12}e_2) = (re_1)e_{12}e_2 = 0$, as
required. Similarly we can define $\psi : Re_2 \to R$ by $\psi(re_2) = re_{21}$. Then
\[
\psi\varphi(e_1) = \psi(e_{12}) = \psi(e_{12}e_2) = e_{12}\psi(e_2) = e_{12}e_{21} = e_1,
\]
and similarly $\varphi\psi(e_2) = e_2$. Since $\psi\varphi$ and $\varphi\psi$ are $R$ module homomorphisms,
each is the identity on its domain.


Corollary 2.22. Let $R$ be a left Artinian ring. For each $r$ in $R$, let $\bar{r}$ be the coset
$r + \operatorname{rad} R$ in $R/\operatorname{rad} R$. If $e_1$ and $e_2$ are idempotents in $R$, then $e_1$ and $e_2$ are
isomorphic if and only if $\bar{e}_1$ and $\bar{e}_2$ are isomorphic.

PROOF. If $e_1$ and $e_2$ are given as isomorphic in $R$, let $e_{12}$ and $e_{21}$ be as in
Proposition 2.21, and pass to $R/\operatorname{rad} R$ by the quotient homomorphism to obtain
elements $\bar{e}_{12}$ and $\bar{e}_{21}$ that exhibit $\bar{e}_1$ and $\bar{e}_2$ as isomorphic idempotents.

Conversely let $\bar{e}_1$ and $\bar{e}_2$ be isomorphic idempotents in $R/\operatorname{rad} R$, and use
Proposition 2.21 to produce elements $\bar{u}_{12}$ and $\bar{u}_{21}$ in $R/\operatorname{rad} R$ such that
\[
\bar{e}_1\bar{u}_{12}\bar{e}_2 = \bar{u}_{12}, \quad
\bar{e}_2\bar{u}_{21}\bar{e}_1 = \bar{u}_{21}, \quad
\bar{u}_{12}\bar{u}_{21} = \bar{e}_1, \quad
\bar{u}_{21}\bar{u}_{12} = \bar{e}_2.
\]
Let $u_{12}$ and $u_{21}$ be preimages of $\bar{u}_{12}$ and $\bar{u}_{21}$ in $R$. Possibly replacing $u_{12}$ by $e_1u_{12}e_2$
and $u_{21}$ by $e_2u_{21}e_1$, we may assume that $e_1u_{12}e_2 = u_{12}$ and $e_2u_{21}e_1 = u_{21}$. Our
construction is such that $u_{12}u_{21} = e_1 - z_1$ with $z_1$ in $\operatorname{rad} R$ and $e_1z_1 = z_1 = z_1e_1$.
Since $z_1$ is a nilpotent element,
\[
(e_1 - z_1)(e_1 + z_1 + z_1^2 + \cdots + z_1^n) = e_1
\]
as soon as $z_1^{n+1} = 0$. Thus we have $u_{12}u_{21}(e_1 + z_1 + z_1^2 + \cdots + z_1^n) = e_1$.
Define $e_{12} = u_{12}$ and $e_{21} = u_{21}(e_1 + z_1 + z_1^2 + \cdots + z_1^n)$. Then it is immediate
that $\bar{e}_{12} = \bar{u}_{12}$, $\bar{e}_{21} = \bar{u}_{21}$, and $e_{12}e_{21} = e_1$. Also, the equality $e_1u_{12}e_2 = u_{12}$
implies that $e_1e_{12}e_2 = e_{12}$, and the equality $e_2u_{21}e_1(e_1 + z_1 + z_1^2 + \cdots + z_1^n) =
u_{21}(e_1 + z_1 + z_1^2 + \cdots + z_1^n)$ implies that $e_2e_{21}e_1 = e_{21}$ since $e_1z_1 = z_1 = z_1e_1$.


In view of Proposition 2.21, we are left with checking the value of $e_{21}e_{12}$. We
know that $\bar{e}_{21}\bar{e}_{12} = \bar{u}_{21}\bar{u}_{12} = \bar{e}_2$, and hence $e_{21}e_{12} = e_2 - z_2$ for some $z_2$ in $\operatorname{rad} R$.
Multiplying by $e_2$ on both sides, we see that
\[
e_2z_2 = z_2 = z_2e_2. \tag{$*$}
\]
Now $(e_{21}e_{12})(e_{21}e_{12}) = e_{21}e_1e_{12} = e_{21}e_{12}$, and thus $(e_2 - z_2)^2 = e_2 - z_2$.
Expanding out this equality and using ($*$) gives $e_2 - 2z_2 + z_2^2 = e_2 - z_2$ and
therefore gives $z_2^2 = z_2$. Hence $z_2^n = z_2$ for every $n \geq 1$. But $z_2$ is in $\operatorname{rad} R$, and
every element of $\operatorname{rad} R$ is nilpotent. Thus $z_2 = 0$, and $e_{21}e_{12} = e_2$ as required.


The proof of Corollary 2.22 shows a little more than the statement asserts, 
and we shall use this little extra conclusion when we finally get to the proof of 
Theorem 2.18. The extra fact is that any elements $\bar{u}_{12}$ and $\bar{u}_{21}$ exhibiting $\bar{e}_1$ and
$\bar{e}_2$ as isomorphic have lifts to elements $e_{12}$ and $e_{21}$ exhibiting $e_1$ and $e_2$ as isomorphic.

The critical step of lifting a single idempotent from A/rad A to A is accom- 
plished by the following proposition. 


Proposition 2.23. Let $R$ be a left Artinian ring. For each $r$ in $R$, let $\bar{r}$ be the
element $r + \operatorname{rad} R$ of $R/\operatorname{rad} R$. If $a$ is an element of $R$ such that $\bar{a}$ is idempotent
in $R/\operatorname{rad} R$, then there exists an idempotent $e$ in $R$ such that $\bar{e} = \bar{a}$.

PROOF. Set $b = 1 - a$. The elements $a$ and $b$ commute, and $ab = a(1 - a)$
maps to $\bar{a} - \bar{a}^2 = 0$ in $R/\operatorname{rad} R$, since $\bar{a}$ is idempotent. Therefore $ab$ lies in $\operatorname{rad} R$
and must satisfy $(ab)^n = 0$ for some $n$. Since $a$ and $b$ commute, we can apply
the Binomial Theorem to obtain
\[
1 = (a+b)^{2n} = \sum_{k=0}^{2n} \binom{2n}{k} a^{2n-k}b^k.
\]
Define
\[
e = \sum_{k=0}^{n} \binom{2n}{k} a^{2n-k}b^k
\quad\text{and}\quad
f = \sum_{k=n+1}^{2n} \binom{2n}{k} a^{2n-k}b^k.
\]
Each term of $e$ contains at least the $n$th power of $a$, and each term of $f$ contains at
least the $n$th power of $b$. Thus each term of $ef$ contains at least a factor $a^nb^n =
(ab)^n = 0$, and we see that $ef = 0$. Therefore $e = e1 = e(e + f) = e^2 + 0 = e^2$,
and $e$ is an idempotent. Each term of $e$ except the one for $k = 0$ contains a factor
$ab$, and thus $e \equiv a^{2n} \bmod \operatorname{rad} R$. Since $\bar{a}$ is idempotent, $a^{2n} \equiv a \bmod \operatorname{rad} R$, and
therefore $\bar{e} = \bar{a}$.
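
The formula for $e$ in this proof is explicit enough to compute with. The following is a minimal sketch in the commutative ring $R = \mathbb{Z}/36\mathbb{Z}$, an illustrative choice that does not appear in the text: there $\operatorname{rad} R = 6\mathbb{Z}/36\mathbb{Z}$ and $R/\operatorname{rad} R \cong \mathbb{Z}/6\mathbb{Z}$, in which 3 and 4 are idempotents. The function name `lift_idempotent` and its parameters are hypothetical.

```python
from math import comb


def lift_idempotent(a, modulus, n):
    """Return e = sum_{k=0}^{n} C(2n, k) a^(2n-k) b^k mod modulus, with b = 1 - a.

    Assumes (a*b)^n is 0 modulo `modulus`, as in the proof of Proposition 2.23.
    """
    b = (1 - a) % modulus
    if pow(a * b, n, modulus) != 0:
        raise ValueError("(ab)^n must vanish in the ring")
    terms = (
        comb(2 * n, k) * pow(a, 2 * n - k, modulus) * pow(b, k, modulus)
        for k in range(n + 1)
    )
    return sum(terms) % modulus


# 3 is idempotent modulo 6 but not modulo 36; here (ab)^2 = 0 modulo 36.
e = lift_idempotent(3, 36, 2)
# 4 is the complementary idempotent modulo 6.
f = lift_idempotent(4, 36, 2)
```

In this instance the two lifts are 9 and 28, which are idempotent modulo 36, reduce to 3 and 4 modulo 6, and happen to be orthogonal with sum 1.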


For the proof of Theorem 2.18, we need to lift an entire matrix ring to obtain a
matrix ring, and this involves lifting more than a single idempotent. In effect, we
have to lift compatibly an entire system $e_{ij}$ that behaves like the usual system of
$E_{ij}$ for matrices. The idea is that if $R/\operatorname{rad} R$ is a matrix ring $M_n(K)$ with some
ring of coefficients $K$, then the $i$th and $j$th columns of $M_n(K)$ may be described
compatibly as $M_n(K)\bar{e}_{ii}$ and $M_n(K)\bar{e}_{jj}$. Proposition 2.23 allows us to lift $\bar{e}_{ii}$
and $\bar{e}_{jj}$ to idempotents $e_{ii}$ and $e_{jj}$, and Corollary 2.22 shows that an isomorphism
$\bar{e}_{ii} \cong \bar{e}_{jj}$ implies an isomorphism $e_{ii} \cong e_{jj}$. The isomorphism gives us elements
$e_{ij}$ and $e_{ji}$, and then we can piece these together to form matrices.

Two idempotents $e$ and $f$ in a ring $R$ with identity are said to be orthogonal
if $ef = 0 = fe$. Suppose that $e_1, \dots, e_n$ are mutually orthogonal idempotents
such that $\sum_{i=1}^n e_i = 1$. Let us see in this case that
\[
R = Re_1 \oplus \cdots \oplus Re_n
\]
as left $R$ modules. In fact, the condition $\sum_{i=1}^n e_i = 1$ shows that $r = \sum_{i=1}^n re_i$
for each $r \in R$, and thus $R = Re_1 + \cdots + Re_n$. If $r$ lies in $Re_j \cap \sum_{i \neq j} Re_i$, then
$r = se_j$ and $r = \sum_{i \neq j} r_ie_i$. Multiplying the first of these equalities on the right
by $e_j$ gives $re_j = se_j^2 = se_j = r$. Hence the second of these equalities, upon
multiplication by $e_j$, yields $r = re_j = \sum_{i \neq j} r_ie_ie_j = 0$. In other words, the sum
is direct, as asserted.


Corollary 2.24. Let $R$ be a left Artinian ring. For each $r$ in $R$, let $\bar{r}$ be the coset
$r + \operatorname{rad} R$ in $R/\operatorname{rad} R$. If $x$ and $y$ are orthogonal idempotents in $\bar{R} = R/\operatorname{rad} R$
and if $e$ is an idempotent in $R$ with $\bar{e} = x$, then there exists an idempotent $f$ in
$R$ with $\bar{f} = y$ and $ef = fe = 0$.

PROOF. By Proposition 2.23 choose an idempotent $f_0$ in $R$ with $\bar{f}_0 = y$. Then
$f_0e$ has $\overline{f_0e} = yx = 0$. Hence $f_0e$ is in $\operatorname{rad} R$, and $(f_0e)^{n+1} = 0$ for some $n$.
Consequently $1 + f_0e + (f_0e)^2 + \cdots + (f_0e)^n$ is a two-sided inverse to $1 - f_0e$.
Define
\[
f = (1 - e)\bigl(1 + f_0e + (f_0e)^2 + \cdots + (f_0e)^n\bigr) f_0 (1 - f_0e).
\]
Then $\bar{f} = (1 - x)(1 + 0 + \cdots + 0)y(1 - 0) = (1 - x)y = y - xy = y$. Moreover,
\[
fe = (1 - e)\bigl(1 + f_0e + (f_0e)^2 + \cdots + (f_0e)^n\bigr)(f_0e - f_0^2e^2) = 0
\]
since $f_0e - f_0^2e^2 = f_0e - f_0e = 0$, and
\[
ef = e(1 - e)\bigl(1 + f_0e + (f_0e)^2 + \cdots + (f_0e)^n\bigr) f_0 (1 - f_0e) = 0
\]
since $e(1 - e) = 0$.

We still need to see that $f^2 = f$. Since $f_0(1 - f_0e) = f_0(1 - e)$, we can write
$f = (1 - e)(1 + f_0e + \cdots)f_0(1 - e)$ and
\[
\begin{aligned}
f^2 &= (1 - e)(1 + f_0e + \cdots)f_0(1 - e)(1 + f_0e + \cdots)f_0(1 - e) \\
    &= (1 - e)(1 + f_0e + \cdots)f_0(1 - f_0e)(1 + f_0e + \cdots)f_0(1 - e) \\
    &= (1 - e)(1 + f_0e + \cdots)f_0 \cdot 1 \cdot f_0(1 - e) \\
    &= (1 - e)(1 + f_0e + \cdots)f_0(1 - f_0e) = f,
\end{aligned}
\]
as required.
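
The construction of $f$ can be traced in a small noncommutative example. The following is a minimal sketch, assuming $R$ is the ring of upper triangular 2-by-2 matrices over $\mathbb{Q}$ (so that $\operatorname{rad} R$ is the set of strictly upper triangular matrices), with integer entries used for simplicity; the ring, the particular matrices, and the helper names are illustrative choices, not taken from the text.

```python
def mat_mul(a, b):
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def mat_add(a, b):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def mat_sub(a, b):
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


ONE = [[1, 0], [0, 1]]
ZERO = [[0, 0], [0, 0]]


def orthogonal_lift(e, f0, n):
    """Return f = (1-e)(1 + f0 e + ... + (f0 e)^n) f0 (1 - f0 e).

    Assumes e and f0 are idempotent and (f0 e)^(n+1) = 0, as in Corollary 2.24.
    """
    f0e = mat_mul(f0, e)
    geometric, power = ONE, ONE
    for _ in range(n):
        power = mat_mul(power, f0e)
        geometric = mat_add(geometric, power)
    left = mat_mul(mat_sub(ONE, e), geometric)
    return mat_mul(mat_mul(left, f0), mat_sub(ONE, f0e))


# e and f0 are idempotents whose diagonal parts (0,1) and (1,0) are orthogonal,
# but f0 e is a nonzero element of the radical with square 0, so n = 1.
e = [[0, 3], [0, 1]]
f0 = [[1, 2], [0, 0]]
f = orthogonal_lift(e, f0, 1)
```

Here `f0` has the right image modulo the radical but fails to be orthogonal to `e`; the corrected element `f` has the same diagonal part as `f0` and satisfies all three required identities.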


Corollary 2.25. Let $R$ be a left Artinian ring. For each $r$ in $R$, let $\bar{r}$ be the coset
$r + \operatorname{rad} R$ in $R/\operatorname{rad} R$. If $\{x_1, \dots, x_N\}$ is a finite set of mutually orthogonal
idempotents in $\bar{R} = R/\operatorname{rad} R$, then there exists a set of mutually orthogonal
idempotents $\{e_1, \dots, e_N\}$ in $R$ such that $\bar{e}_i = x_i$ for all $i$. If $\sum_{i=1}^N x_i = 1$, then
$\sum_{i=1}^N e_i = 1$.

PROOF. For the existence of $\{e_1, \dots, e_N\}$, we proceed by induction on $N$, the
case $N = 1$ being Proposition 2.23. Suppose we have found $e_1, \dots, e_n$ and we
want to find $e_{n+1}$. Let $e$ be the idempotent $e_1 + \cdots + e_n$, and apply Corollary
2.24 to the idempotent $e$ in $R$ and the idempotent $x_{n+1}$ in $R/\operatorname{rad} R$. The corollary
gives us $e_{n+1}$ orthogonal to $e$ with $\bar{e}_{n+1} = x_{n+1}$. Since $e_i = e_ie = ee_i$ for $i \leq n$,
we obtain $e_{n+1}e_i = e_{n+1}(ee_i) = (e_{n+1}e)e_i = 0$ and similarly $e_ie_{n+1} = 0$ for
those $i$'s, and the induction is complete.

Finally $\sum_i x_i = 1$ implies that $\sum_i e_i = 1 + r$ for some $r$ in $\operatorname{rad} R$. Then
the idempotent $1 - \sum_i e_i$ is exhibited as in $\operatorname{rad} R$ and must be 0 because every
element of $\operatorname{rad} R$ is nilpotent.

In a nonzero ring $R$ with identity, a finite subset $\{e_{ij} \mid i, j \in \{1, \dots, n\}\}$ is
called a set of matrix units in $R$ if $\sum_{i=1}^n e_{ii} = 1$ and $e_{ij}e_{kl} = \delta_{jk}e_{il}$ for all
$i, j, k, l$. It follows from these conditions that the $e_{ii}$ are mutually orthogonal
idempotents with sum 1, since $e_{ii}e_{jj} = \delta_{ij}e_{ij} = \delta_{ij}e_{ii}$. In view of the remarks
before Corollary 2.24, we automatically have $R = \bigoplus_{i=1}^n Re_{ii}$. In addition, the
product rule gives $e_{ii}e_{ij}e_{jj} = e_{ij}$, $e_{jj}e_{ji}e_{ii} = e_{ji}$, $e_{ij}e_{ji} = e_{ii}$, and $e_{ji}e_{ij} = e_{jj}$;
by Proposition 2.21 the idempotents $e_{ii}$ and $e_{jj}$ are isomorphic in the sense that
there is a left $R$ module isomorphism $Re_{ii} \cong Re_{jj}$.

If $A = M_n(R)$, define $E_{ij}$ to be the matrix that is 1 in the $(i,j)$th place and
is 0 elsewhere. Then it is immediate that $\{E_{ij}\}$ is a set of matrix units in $A$. To
recognize matrix rings, we prove the following converse.


Proposition 2.26. For a nonzero ring $A$ with identity, suppose that
\[
\{e_{ij} \mid i, j \in \{1, \dots, n\}\}
\]
is a set of matrix units in $A$. Let $R$ be the subring of $A$ of all elements of $A$
commuting with all $e_{ij}$. Then every element of $A$ can be written in one and only
one way as $\sum_{i,j} r_{ij}e_{ij}$ with $r_{ij} \in R$ for all $i$ and $j$, and the map $A \to M_n(R)$
given by $a \mapsto [r_{ij}]$ is a ring isomorphism. The ring $R$ can be recovered from $A$
by means of the isomorphism $R \cong e_{11}Ae_{11}$.


PROOF. To each $a \in A$, associate the matrix $[r_{ij}]$ in $M_n(A)$ whose entries are
given by $r_{ij} = \sum_k e_{ki}ae_{jk}$. Then
\[
r_{ij}e_{lm} = \sum_k e_{ki}ae_{jk}e_{lm} = \sum_k e_{ki}a\delta_{kl}e_{jm} = e_{li}ae_{jm} \tag{$*$}
\]
and
\[
e_{lm}r_{ij} = \sum_k e_{lm}e_{ki}ae_{jk} = \sum_k \delta_{mk}e_{li}ae_{jk} = e_{li}ae_{jm}.
\]
Thus $r_{ij}e_{lm} = e_{li}ae_{jm} = e_{lm}r_{ij}$. Because of the definition of $R$, this equality
shows that $r_{ij}$ is in $R$. In particular, $[r_{ij}]$ is in $M_n(R)$. A special case of ($*$) is
that $r_{ij}e_{ij} = e_{ii}ae_{jj}$. Hence
\[
\sum_{i,j} r_{ij}e_{ij} = \sum_{i,j} e_{ii}ae_{jj} = 1a1 = a.
\]
This proves that $a$ can be expanded as $a = \sum_{i,j} r_{ij}e_{ij}$.
For uniqueness, suppose that $a = \sum_{i,j} s_{ij}e_{ij}$ is given with each $s_{ij}$ in $R$.
Multiplication on the left by $e_{kp}$ and right by $e_{qk}$, followed by addition over $k$, gives
\[
r_{pq} = \sum_k e_{kp}ae_{qk} = \sum_k e_{kp}\Bigl(\sum_{i,j} s_{ij}e_{ij}\Bigr)e_{qk}
= \sum_{i,j,k} s_{ij}e_{kp}e_{ij}e_{qk} = \sum_k s_{pq}e_{kk} = s_{pq}.
\]
This proves that the map $A \to M_n(R)$ is one-one onto.

To see that the map $A \to M_n(R)$ respects multiplication, let $a$ and $a'$ be in
$A$, and let the effect of the map on $a$, $a'$, and $aa'$ be $a \mapsto [r_{ij}]$, $a' \mapsto [r'_{ij}]$, and
$aa' \mapsto [s_{ij}]$. Then we have
\[
\sum_l r_{il}r'_{lj} = \sum_{l,k,k'} e_{ki}ae_{lk}e_{k'l}a'e_{jk'}
= \sum_{l,k} e_{ki}ae_{ll}a'e_{jk} = \sum_k e_{ki}aa'e_{jk} = s_{ij},
\]
and the matrix product of the images of $a$ and $a'$ coincides with the image of $aa'$.

Finally consider the image $E_{11} = [r_{ij}]$ of the element $a = e_{11}$ of $A$. It has
$r_{ij} = \sum_k e_{ki}e_{11}e_{jk} = \delta_{i1}\delta_{1j}\sum_k e_{kk} = \delta_{i1}\delta_{1j}$. If $a$ is a general element of $A$ and
its image is $[r_{ij}]$, then the result of the previous paragraph shows that $e_{11}ae_{11}$
maps to $E_{11}[r_{ij}]E_{11} = r_{11}E_{11}$. Hence the map $e_{11}ae_{11} \mapsto r_{11}$ is an isomorphism
of $e_{11}Ae_{11}$ with $R$.
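
The coefficient formula $r_{ij} = \sum_k e_{ki}ae_{jk}$ can be tried on matrix units that are not the standard ones. The following is a minimal sketch, assuming $A = M_2(\mathbb{Z})$ and the matrix units $e_{ij} = PE_{ij}P^{-1}$ for one fixed invertible matrix $P$; the choice of $P$, the test matrix, and the helper names are illustrative and not from the text. Indices in the code start at 0 rather than 1.

```python
N = 2


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(N)] for i in range(N)]


ZERO = [[0] * N for _ in range(N)]
P = [[1, 1], [0, 1]]
P_INV = [[1, -1], [0, 1]]


def unit(i, j):
    """Matrix unit e_ij = P E_ij P^{-1}, where E_ij is the standard unit."""
    elementary = [[1 if (r, c) == (i, j) else 0 for c in range(N)] for r in range(N)]
    return mat_mul(mat_mul(P, elementary), P_INV)


def coefficient(a, i, j):
    """r_ij = sum_k e_ki a e_jk, as in the proof of Proposition 2.26."""
    total = ZERO
    for k in range(N):
        total = mat_add(total, mat_mul(mat_mul(unit(k, i), a), unit(j, k)))
    return total


def reconstruct(a):
    """Return sum_{i,j} r_ij e_ij, which the proposition says equals a."""
    total = ZERO
    for i in range(N):
        for j in range(N):
            total = mat_add(total, mat_mul(coefficient(a, i, j), unit(i, j)))
    return total


a = [[2, 3], [5, 7]]
coefficients = [[coefficient(a, i, j) for j in range(N)] for i in range(N)]
```

In this instance each coefficient is a scalar matrix, consistent with $R$ being the set of elements commuting with all $e_{ij}$, and the scalars are the entries of $P^{-1}aP$.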


PROOF OF THEOREM 2.18. Let $\{x_{ij} \mid i, j \in \{1, \dots, n\}\}$ be a set of matrix
units for the matrix ring $A/\operatorname{rad} A \cong M_n(D)$. Then $x_{11}, \dots, x_{nn}$ are mutually
orthogonal idempotents in $A/\operatorname{rad} A$ with sum 1. By Corollary 2.25 we can
choose mutually orthogonal idempotents $e_{11}, \dots, e_{nn}$ in $A$ with $\sum_{i=1}^n e_{ii} = 1$
and with $\bar{e}_{ii} = x_{ii}$.

We observed at the time of defining matrix units that $x_{11}, \dots, x_{nn}$ are isomorphic
as idempotents. Corollary 2.22 shows as a consequence that $e_{11}, \dots, e_{nn}$
are isomorphic as idempotents. The remarks following Corollary 2.22 show that
the isomorphism of $Ae_{11}$ with $Ae_{ii}$ can be exhibited by elements $e_{1i}$ and $e_{i1}$ in $A$
satisfying the usual properties
\[
e_{11}e_{1i}e_{ii} = e_{1i}, \quad e_{ii}e_{i1}e_{11} = e_{i1}, \quad e_{1i}e_{i1} = e_{11}, \quad e_{i1}e_{1i} = e_{ii}
\]
and also the properties $\bar{e}_{1i} = x_{1i}$ and $\bar{e}_{i1} = x_{i1}$. Here $\bar{a}$ is shorthand for $a + \operatorname{rad} A$.
Define $e_{ij} = e_{i1}e_{1j}$. Then $\bar{e}_{ij} = \bar{e}_{i1}\bar{e}_{1j} = x_{i1}x_{1j} = x_{ij}$, and we readily check that
$\{e_{ij}\}$ is a set of matrix units for $A$.

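By Proposition 2.26, $A \cong M_n(R)$ with $R \cong e_{11}Ae_{11}$. From Corollary 2.20
we know that $e_{11}Ae_{11}/\operatorname{rad}(e_{11}Ae_{11}) \cong \bar{e}_{11}(A/\operatorname{rad} A)\bar{e}_{11}$, where $\bar{e}_{11}$ denotes the
element $e_{11} + \operatorname{rad} A$ of $A/\operatorname{rad} A$. Hence
\[
R/\operatorname{rad} R \cong \bar{e}_{11}(A/\operatorname{rad} A)\bar{e}_{11} \cong E_{11}M_n(D)E_{11} \cong D,
\]
and the proof is complete.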


Corollary 2.27. If $A$ is a finite-dimensional algebra with identity over a field
$F$ and if $A/\operatorname{rad} A \cong M_n(F)$ as algebras, then there is a subalgebra $S$ isomorphic
to $M_n(F)$ such that $A = S \oplus \operatorname{rad} A$ as vector spaces.

REMARKS. This corollary shows that Theorem 2.18 implies Theorem 2.17
under the additional assumption that the algebra $A$ of Theorem 2.17 satisfies
$A/\operatorname{rad} A \cong M_n(F)$. It is not necessary to assume characteristic 0.


PROOF. Suppose that $A$ is a finite-dimensional algebra with identity over
$F$ such that $A/\operatorname{rad} A \cong M_n(F)$. Then $A$ is left Artinian, and Theorem 2.18
produces a certain ring $R$ with $A \cong M_n(R)$. Here Proposition 2.26 shows
that $R$ is isomorphic as a ring to $e_{11}Ae_{11}$ for a certain idempotent $e_{11}$ in $A$. It
follows that $R$ is an algebra with identity over $F$, necessarily finite-dimensional
because $A$ is finite-dimensional. The algebra $R$, according to Theorem 2.18, has
$R/\operatorname{rad} R \cong F$. Therefore $R = F \oplus \operatorname{rad} R$ as $F$ vector spaces. If we allow
$M_n(\,\cdot\,)$ to be defined even for rings without identity, then we have $F$ algebra
isomorphisms
\[
A \cong M_n(R) = M_n(F \oplus \operatorname{rad} R) = M_n(F) \oplus M_n(\operatorname{rad} R)
\]
in which the direct sums are understood to be direct sums of vector spaces. We
shall show that
\[
\operatorname{rad}(M_n(R)) = M_n(\operatorname{rad} R), \tag{$*$}
\]
and then the decomposition $A = S \oplus \operatorname{rad} A$ will have been proved with $S =
M_n(F)$.

To prove ($*$), let $E_{ij}$ be the member of $M_n(R)$ that is 1 in the $(i,j)$th place
and is 0 elsewhere. Suppose that $J$ is a two-sided ideal in $M_n(R)$. Let $I \subseteq R$
be the set of all elements $x_{11}$ for $x \in J$. If $r$ is in $R$, then $rE_{11}$ is a member of
$M_n(R)$, and the $(1,1)$th entry of the element $(rE_{11})x$ of $J$ is $rx_{11}$. Thus $rx_{11}$ is
in $I$. Similarly $x_{11}r$ is in $I$, and $I$ is a two-sided ideal in $R$. Let us see that
\[
J = M_n(I). \tag{$**$}
\]
If $x$ is in $J$, then so is $E_{i1}xE_{1j} = x_{11}E_{ij}$, and hence $IE_{ij}$ is in $J$; taking sums
over $i$ and $j$ shows that $M_n(I) \subseteq J$. In the reverse direction if $x$ is in $J$, then so
is $E_{1i}xE_{j1} = x_{ij}E_{11}$, and hence $x_{ij}$ is in $I$; therefore $J \subseteq M_n(I)$. This proves
($**$). Let us apply ($**$) with $J = \operatorname{rad}(M_n(R))$. The corresponding ideal $I$ of $R$
consists of all entries $x_{11}$ of members $x$ of $J$. Using Corollary 2.20, we obtain
\[
IE_{11} = E_{11}JE_{11} = E_{11}\operatorname{rad}(M_n(R))E_{11} = \operatorname{rad}(E_{11}M_n(R)E_{11}) = \operatorname{rad}(RE_{11}).
\]
Thus $I = \operatorname{rad} R$. Taking $M_n(\,\cdot\,)$ of both sides and applying ($**$), we arrive at ($*$).
This completes the proof.


Corollary 2.28. If $A$ is a finite-dimensional associative algebra with identity
over a field $F$ and if $A/\operatorname{rad} A \cong M_{n_1}(F) \times \cdots \times M_{n_r}(F)$, then there is a subalgebra
$S$ of $A$ isomorphic as an algebra to $A/\operatorname{rad} A$ such that $A = S \oplus \operatorname{rad} A$ as vector
spaces.

REMARKS. This corollary gives the conclusion of Theorem 2.17 under the
additional assumption that the semisimple algebra $A/\operatorname{rad} A$ over $F$ is of the form
$A/\operatorname{rad} A \cong M_{n_1}(F) \times \cdots \times M_{n_r}(F)$. If $F$ is algebraically closed, then the
division rings $D_j$ in Theorem 2.2 are finite-dimensional division algebras over
$F$ and necessarily equal $F$, as was observed in the discussion after Corollary
2.3. Thus Theorem 2.2 shows that the additional assumption about the form of
$A/\operatorname{rad} A$ is automatically satisfied if $F$ is algebraically closed. In other words,
Corollary 2.28 completes the proof of Theorem 2.17 if $F$ is algebraically closed.


PROOF. For $1 \leq j \leq r$, let $x_j$ be the identity matrix of $M_{n_j}(F)$ when
$M_{n_j}(F)$ is regarded as a subalgebra of $A/\operatorname{rad} A$. The elements $x_j$ are orthogonal
idempotents in $A/\operatorname{rad} A$ with sum 1, and Corollary 2.25 shows that they lift to
orthogonal idempotents $e_j$ of $A$ with sum 1. For each $j$, Corollary 2.20 shows that
$e_jAe_j/\operatorname{rad}(e_jAe_j) \cong x_j(A/\operatorname{rad} A)x_j \cong M_{n_j}(F)$. By Corollary 2.27, $e_jAe_j$ has
a subalgebra $S_j \cong M_{n_j}(F)$ with $e_jAe_j = S_j \oplus \operatorname{rad}(e_jAe_j)$ as vector spaces. Put
$S = \bigoplus_{j=1}^r S_j$, the direct sum being understood in the sense of vector spaces. The
subalgebra $S_j$ has identity $e_j$, and the product of $e_j$ with any other $S_i$ is 0 because
$e_ie_j = e_je_i = 0$ when $i \neq j$. If $s = \sum_i s_i$ and $s' = \sum_j s'_j$ are two elements of $S$,
then $ss' = \bigl(\sum_i s_ie_i\bigr)\bigl(\sum_j e_js'_j\bigr) = \sum_{i,j} s_ie_ie_js'_j = \sum_j s_je_js'_j = \sum_j s_js'_j$. Hence
$S$ is a subalgebra. The element $\sum_j e_j$ is a two-sided identity in $S$.

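Let us prove that $S \cap \operatorname{rad} A = 0$. If $s = \sum_j s_j$ is in $S \cap \operatorname{rad} A$, then $s_j = e_jse_j$ is
in $S_j = e_jSe_j$ and is in $e_j(\operatorname{rad} A)e_j$, which equals $\operatorname{rad}(e_jAe_j)$ by Corollary 2.20.
Since $S_j \cap \operatorname{rad}(e_jAe_j) = 0$ by construction, $s_j = 0$. Thus $s = \sum_j s_j = 0$.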

Hence $S \cap \operatorname{rad} A = 0$. A count of dimensions gives $\dim S = \sum_{j=1}^r \dim S_j
= \sum_{j=1}^r n_j^2 = \dim(A/\operatorname{rad} A)$. Thus $\dim A = \dim S + \dim(\operatorname{rad} A)$, and we conclude
that $A = S \oplus \operatorname{rad} A$ as vector spaces.

In this section we shall complete the proof of Wedderburn’s Main Theorem
(Theorem 2.17). In the previous section we proved in Corollary 2.28 the special
case in which $A/\operatorname{rad} A$ is isomorphic to a product of full matrix rings over the
base field $F$. This special case includes all cases of Theorem 2.17 in which $F$ is
algebraically closed.

The idea for the general case is to make a change of rings by tensoring $A$ with
the algebraic closure of the underlying field $F$, or at least with a large enough
finite extension $K$ of $F$ for Corollary 2.28 to be applicable. That is, we first
consider $A_K = A \otimes_F K$ and $(A/\operatorname{rad} A) \otimes_F K$ in place of $A$ and $A/\operatorname{rad} A$. Inside
$A_K$ we can recognize $(\operatorname{rad} A) \otimes_F K$ as a subalgebra defined over $K$, and we
expect that it is $\operatorname{rad} A_K$ and that we can find a complementary subalgebra $S$ over
$K$; then the question is one of showing that $S$ is of the form $S_0 \otimes_F K$ for some
semisimple subalgebra $S_0$ of $A$ defined over $F$. The trouble with this style of
argument is that the tensor product $(A/\operatorname{rad} A) \otimes_F K$ need not be semisimple and
there need not be a candidate for $S$. Some question about separability of field
extensions plays a role, as the following example shows, and the assumption of
characteristic 0 will ensure this separability.


EXAMPLE. We exhibit two extension fields $K$ and $L$ of a base field $F$ such that
$K \otimes_F L$ is not a semisimple algebra over $F$. The field extensions are each 1-by-1
matrix algebras over an extension field of $F$ and hence are simple algebras, yet
the tensor product is not semisimple. Fix a prime field $\mathbb{F}_p$, and let $F = \mathbb{F}_p(x^p)$ be
a simple transcendental extension of $\mathbb{F}_p$. Define $K = L = \mathbb{F}_p(x) = F(\sqrt[p]{x^p})$.
Both $K$ and $L$ are field extensions of $F$ of degree $p$. Thus $K \otimes_F L$ is a finite-dimensional
commutative algebra with identity over $F$, by the construction in
Proposition 10.24 of Basic Algebra. The element $z = x \otimes 1 - 1 \otimes x$ in $K \otimes_F L$
is nonzero but has $z^p = x^p \otimes 1 - 1 \otimes x^p = x^p \otimes 1 - x^p \otimes 1 = 0$, the next-to-last
equality following because $x^p$ lies in the base field $F$. Consequently $K \otimes_F L$ has
a nonzero nilpotent element. If $K \otimes_F L$ were semisimple, Theorem 2.2 would
show that it was the direct product of fields, and it could not have any nonzero
nilpotent elements. We conclude that $K \otimes_F L$ is not a semisimple algebra.
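
The first equality for $z^p$ in the example rests on the Binomial Theorem in characteristic $p$. As a sketch of that step, using only that $x \otimes 1$ and $1 \otimes x$ commute and that $p$ divides $\binom{p}{k}$ for $0 < k < p$:

```latex
\begin{aligned}
z^p &= (x \otimes 1 - 1 \otimes x)^p
     = \sum_{k=0}^{p} \binom{p}{k} (-1)^k \, (x \otimes 1)^{p-k} (1 \otimes x)^k \\
    &= x^p \otimes 1 + (-1)^p \, (1 \otimes x^p)
     = x^p \otimes 1 - 1 \otimes x^p .
\end{aligned}
```

Here $(-1)^p = -1$ when $p$ is odd, while for $p = 2$ the elements $+1$ and $-1$ of the field coincide, so the last equality holds in every case.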


Proposition 2.29. Let $F$ be a field, let $K = F(\alpha)$ be a simple algebraic
extension, let $g(X)$ be the minimal polynomial of $\alpha$ over $F$, and let $L$ be another
field extension of $F$. Then

(a) $K \otimes_F L \cong L[X]/(g(X))$ as associative algebras over $L$,
(b) $K \otimes_F L$ is a semisimple algebra if the polynomial $g(X)$ is separable.


REMARKS. Proposition 10.24 of Basic Algebra shows that the tensor product
$A \otimes_F B$ of two associative algebras with identity over $F$ has a unique associative
algebra structure such that $(a_1 \otimes b_1)(a_2 \otimes b_2) = a_1a_2 \otimes b_1b_2$. Problem 8 at the
end of Chapter X shows that if $B$ is an extension field of $F$, then $A \otimes_F B$ is in fact
an associative algebra with identity over $B$, the multiplication by $b \in B$ being
given by the mapping $1 \otimes (\text{left by } b)$.


PROOF. For (a), let $n = [K : F]$. Form the $F$ bilinear mapping of $F[X] \times L$ into
$L[X]$ given by $(P(X), \ell) \mapsto \ell P(X)$. Corresponding to this $F$ bilinear mapping
is a unique $F$ linear map $\varphi : F[X] \otimes_F L \to L[X]$ carrying $P(X) \otimes \ell$ to $\ell P(X)$
for $P(X) \in F[X]$ and $\ell \in L$. The $F$ vector space $F[X] \otimes_F L$ is an $L$ vector space
with multiplication by $\ell_0 \in L$ given by the linear mapping $1 \otimes (\text{left by } \ell_0)$. Since
$\varphi\bigl((1 \otimes (\text{left by } \ell_0))(P(X) \otimes \ell)\bigr) = \ell_0\ell P(X) = \ell_0\varphi(P(X) \otimes \ell)$, $\varphi$ is $L$ linear. In
addition, $\varphi\bigl((P(X) \otimes \ell)(Q(X) \otimes \ell')\bigr) = \varphi(P(X)Q(X) \otimes \ell\ell') = \ell\ell'P(X)Q(X) =
\varphi(P(X) \otimes \ell)\varphi(Q(X) \otimes \ell')$, and therefore $\varphi$ is an algebra homomorphism.
We follow $\varphi$ with the quotient homomorphism $\psi : L[X] \to L[X]/(g(X))$,
and the composition $\psi\varphi$ is 0 on the ideal $(g(X)) \otimes_F L$ of $F[X] \otimes_F L$. Therefore
$\psi\varphi$ descends to a homomorphism $(F[X]/(g(X))) \otimes_F L \to L[X]/(g(X))$, hence
to a homomorphism $\eta : K \otimes_F L \to L[X]/(g(X))$. Since $\varphi$ and $\psi$ are onto, so
is $\eta$.

It is enough to prove that $\eta$ is one-one. Thus suppose that $\eta\bigl(\sum_i k_i \otimes \ell_i\bigr) = 0$
with all $k_i$ in $K$, all $\ell_i$ in $L$, and the $\ell_i$ linearly independent over $F$. Write
$k_i = P_i(X) + (g(X))$ with $\deg P_i(X) < n$ whenever $P_i \neq 0$. Then $\sum_i \ell_iP_i(X) \equiv
0 \bmod g(X)$. Since $g(X)$ has degree $n$ and each nonzero $P_i(X)$ has degree at
most $n - 1$, $\sum_i \ell_iP_i(X) = 0$. Write $P_i(X) = \sum_{j=0}^{n-1} c_{ij}X^j$ with each $c_{ij}$ in $F$. Then
$\sum_{j=0}^{n-1}\bigl(\sum_i \ell_ic_{ij}\bigr)X^j = 0$, and $\sum_i \ell_ic_{ij} = 0$ for all $j$. Since the $\ell_i$ are linearly
independent over $F$, $c_{ij} = 0$ for all $i$ and $j$. Thus $k_i = 0$ for all $i$, $\sum_i k_i \otimes \ell_i = 0$,
and $\eta$ is one-one. This proves (a).

For (b), factor $g(X)$ over $L$ as $g_1(X) \cdots g_m(X)$ for polynomials $g_j(X)$ irreducible
over $L$. Since the separability of $g$ forces $g_1, \dots, g_m$ to be relatively
prime in pairs, the Chinese Remainder Theorem implies that
\[
L[X]/(g_1(X) \cdots g_m(X)) \cong L[X]/(g_1(X)) \times \cdots \times L[X]/(g_m(X)).
\]
Each $L[X]/(g_j(X))$ is a field, and thus $L[X]/(g(X))$ is exhibited as a product of
fields and is semisimple.
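
As a small illustration of both parts of Proposition 2.29 (a standard instance, not taken from the text), take $F = \mathbb{R}$ and $K = L = \mathbb{C} = \mathbb{R}(i)$, so that $g(X) = X^2 + 1$:

```latex
\mathbb{C} \otimes_{\mathbb{R}} \mathbb{C}
  \;\cong\; \mathbb{C}[X]/(X^2 + 1)
  \;=\; \mathbb{C}[X]/\bigl((X - i)(X + i)\bigr)
  \;\cong\; \mathbb{C}[X]/(X - i) \times \mathbb{C}[X]/(X + i)
  \;\cong\; \mathbb{C} \times \mathbb{C}.
```

The two linear factors of $g$ over $L$ are distinct because $g$ is separable, and this is what makes them relatively prime for the Chinese Remainder Theorem in part (b).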


Corollary 2.30. Let $F$ be a field, let $K$ be a finite separable algebraic extension
of $F$, and let $L$ be another field extension of $F$. Then the algebra $K \otimes_F L$ is
semisimple.


REMARKS. The condition of separability of the extension $K/F$ is automatic
in characteristic 0. The two field extensions $K$ and $L$ in the example before
Proposition 2.29 both failed to be separable extensions of the base field $F$.


PROOF. The Theorem of the Primitive Element (Theorem 9.34 of Basic Algebra)
shows that $K/F$ is a simple extension, say with $K = F(\alpha)$. Since this
extension is assumed separable, the minimal polynomial over $F$ of any element of
$K$ is a separable polynomial. The hypotheses of Proposition 2.29b are therefore
satisfied, and $K \otimes_F L$ is semisimple.


Proposition 2.31. Suppose that $A$ and $B$ are algebras with identity over a field
$F$, that $B$ is simple, and that $B$ has center $F$. Then the two-sided ideals of the
tensor-product algebra $A \otimes_F B$ are all subsets $I \otimes_F B$ such that $I$ is a two-sided
ideal of $A$.


PROOF. The set $I \otimes_F B$ is a two-sided ideal of $A \otimes_F B$, since $(a \otimes b)(i \otimes b') =
ai \otimes bb'$ and since a similar identity applies to multiplication in the other order.
Conversely suppose that $J$ is an ideal in $A \otimes_F B$. Let $1_B$ be the identity of $B$,
and define $I = \{a \in A \mid a \otimes 1_B \in J\}$. Then $I$ is a two-sided ideal of $A$, and we
shall prove that $J = I \otimes_F B$. The easy inclusion is $I \otimes_F B \subseteq J$. For this, let
$i$ be in $I$ and $b$ be in $B$. Then $i \otimes 1_B$ is in $J$ and $1_A \otimes b$ is in $A \otimes_F B$. Their
product $i \otimes b$ has to be in $J$, and thus $I \otimes_F B \subseteq J$.

For the reverse inclusion, take a basis $\{x_i\}$ of $I$ over $F$ and extend it to a basis
of $A$ by adjoining some vectors $\{y_j\}$. It is enough to show that any finite sum
$\sum_j y_j \otimes b_j$ in $J$ necessarily has all $b_j$ equal to 0. Arguing by contradiction,
suppose that $\sum_{k=1}^m y_{j_k} \otimes b_{j_k}$ is a nonzero sum in $J$ with $m$ as small as possible
and in particular with all $b_{j_k}$ nonzero. Let $H$ be the subset of $B$ defined by
\[
H = \Bigl\{ c_{j_1} \Bigm| \sum_{k=1}^m y_{j_k} \otimes c_{j_k} \in J
\text{ for some $m$-tuple } \{c_{j_k}\} \subseteq B \Bigr\}.
\]


The set $H$ is a two-sided ideal of $B$ containing the nonzero element $b_{j_1}$ of $B$.
Since $B$ is simple by assumption, $H = B$. Thus $1_B$ is in $H$. Therefore some
element
\[
y_{j_1} \otimes 1_B + \sum_{k=2}^m y_{j_k} \otimes c_{j_k}
\]
is in $J$. Let $b \in B$ be arbitrary. Multiplying the displayed element on the left and
right by $1_A \otimes b$ and subtracting the results shows that
\[
y_{j_2} \otimes (bc_{j_2} - c_{j_2}b) + \cdots + y_{j_m} \otimes (bc_{j_m} - c_{j_m}b)
\]


is in $J$. Since $m$ was chosen to be minimal, this element must be 0 for all choices
of $b$. Then all coefficients are 0, and the conclusion is that all coefficients $c_{j_k}$ are
in the center of $B$, which is $F$ by assumption. Consequently we can rewrite our
element of $J$ as
\[
y_{j_1} \otimes 1_B + \sum_{k=2}^m y_{j_k} \otimes c_{j_k}
= y_{j_1} \otimes 1_B + \sum_{k=2}^m c_{j_k}y_{j_k} \otimes 1_B
= (y_{j_1} + c_{j_2}y_{j_2} + \cdots + c_{j_m}y_{j_m}) \otimes 1_B.
\]
The definition of $I$ shows that the factor $y_{j_1} + c_{j_2}y_{j_2} + \cdots + c_{j_m}y_{j_m}$ in the pure tensor
on the right is in $I$. Since the $y_j$'s form a basis of a vector-space complement to
$I$, this vector must be 0. The linear independence of the $y_j$'s over $F$ forces each
coefficient to be 0, and we have arrived at a contradiction because the coefficient
of $y_{j_1}$ is 1, not 0.


Lemma 2.32. The center of a finite-dimensional simple algebra $A$ over a field
$F$ is a field that is a finite extension of $F$.


PROOF. By Theorem 2.4, $A \cong M_n(D)$ for some finite-dimensional division
algebra $D$ over $F$. Let $Z$ be the center of $A$. By inspection this consists of the
scalar matrices whose entries lie in the center of $D$. The center of $D$ is a field.
Hence $Z$ is a field, necessarily a finite extension of $F$.


Proposition 2.33. Let $A$ be a finite-dimensional semisimple algebra over a
field $F$ of characteristic 0, and suppose that $K$ is a field containing $F$. Then the
algebra $A \otimes_F K$ over $K$ is semisimple.


PROOF. Since the tensor product of a finite direct sum is the direct sum of tensor
products, we may assume without loss of generality that $A$ is simple. Lemma 2.32
shows that the center $Z$ of $A$ is a finite extension field of $F$. By Corollary 2.30
and the assumption that $F$ has characteristic 0, the algebra $Z \otimes_F K$ is semisimple.
Being commutative, it must be of the form $K_1 \oplus \cdots \oplus K_s$ with each ideal $K_j$
equal to a field, by Theorem 2.2.

Each ideal $K_j$ is a unital $Z \otimes_F K$ module, hence is both a unital $Z$ module and
a unital $K$ module. Thus we can regard each $K_j$ as an extension field of $Z$ or of
$K$, whichever we choose. First let us regard $K_j$ as an extension field of $Z$. Since
$K_j$ has no nontrivial ideals and $A$ has center $Z$, Proposition 2.31 shows that the
$Z$ algebra $A \otimes_Z K_j$ is simple as a ring.

Next let us regard $K_j$ as an extension field of $K$; since $A$ is finite-dimensional
over $F$, so is $Z$. Therefore $Z \otimes_F K$ is finite-dimensional over $K$, and $K_j$ is a
finite extension of $K$. Hence $A \otimes_Z K_j$ is a finite-dimensional algebra over $K$,
and it is left Artinian as a ring.

By Theorem 2.6, any left Artinian simple ring such as $A \otimes_Z K_j$ is necessarily
semisimple. Using the associativity formula for tensor products given in
Proposition 10.22 of Basic Algebra, we obtain an isomorphism of rings
\[
\begin{aligned}
A \otimes_F K &\cong (A \otimes_Z Z) \otimes_F K \cong A \otimes_Z (Z \otimes_F K) \\
&\cong A \otimes_Z (K_1 \oplus \cdots \oplus K_s) \cong \bigoplus_{j=1}^s (A \otimes_Z K_j),
\end{aligned}
\]
the summands being two-sided ideals in each case. Since each $A \otimes_Z K_j$ is a
finite-dimensional simple algebra over $K$, $A \otimes_F K$ is a semisimple algebra over
$K$ by Theorem 2.4.


Let us digress for a moment, returning in Lemma 2.34 to the argument that
leads to the proof of Theorem 2.17. In the next section we shall want to know
circumstances under which we can draw the same conclusion as in Proposition
2.33 without assuming that the characteristic is 0. Write the finite-dimensional
semisimple algebra $A$ as $A \cong M_{n_1}(D_1) \times \cdots \times M_{n_r}(D_r)$, where each $D_j$ is a
division algebra over $F$. Let $Z_1, \dots, Z_r$ be the respective centers of the simple
factors of $A$. Lemma 2.32 observes that each $Z_j$ is a finite extension field of $F$.
The proof of Proposition 2.33 appealed to Corollary 2.30 to conclude from the
condition characteristic 0 that $Z_j \otimes_F K$ is semisimple. Instead, by rereading the
statement of Corollary 2.30, we see that it would have been enough for each $Z_j$ to
be a finite separable field extension of $F$, even if $F$ did not have characteristic 0.
Then the rest of the above proof goes through without change. Accordingly we
define a finite-dimensional semisimple algebra $A$ over a field $F$ to be a separable
semisimple algebra if the center of each simple component of $A$ is a separable
extension field of $F$. In terms of this definition, we obtain the following improved
version of Proposition 2.33.


Proposition 2.33’. Let $A$ be a finite-dimensional separable semisimple algebra
over a field $F$, and suppose that $K$ is a field containing $F$. Then the algebra $A \otimes_F K$
over $K$ is semisimple.


Lemma 2.34. Suppose that $A$ is a finite-dimensional algebra with identity
over a field $F$, and suppose that $N$ is a nilpotent two-sided ideal of $A$ such that
the algebra $A/N$ is semisimple. Then $N = \operatorname{rad} A$.


PROOF. The algebra $A$ is left Artinian, being finite-dimensional. Since $N$
is nilpotent, we must have $N \subseteq \operatorname{rad} A$. The two-sided ideal $(\operatorname{rad} A)/N$ of the
semisimple algebra $A/N$ is nilpotent and hence must be 0. Therefore $N = \operatorname{rad} A$.


PROOF OF THEOREM 2.17. Let $A$ be the given finite-dimensional algebra over the
field $F$ of characteristic 0, and write $N$ for $\operatorname{rad} A$ and $\bar{A}$ for $A/N$. For any extension
field $K$ of $F$, we write $A_K = A \otimes_F K$, $N_K = N \otimes_F K$, and $\bar{A}_K = \bar{A} \otimes_F K$.

For most of the proof, we shall treat the special case that $N^2 = 0$. Let
$\bar{F}$ be an algebraic closure of $F$. Then $\bar{A}_{\bar{F}} = \bar{A} \otimes_F \bar{F} = (A/N) \otimes_F \bar{F} \cong
(A \otimes_F \bar{F})/(N \otimes_F \bar{F}) = A_{\bar{F}}/N_{\bar{F}}$. Proposition 2.33 shows that $\bar{A}_{\bar{F}} = \bar{A} \otimes_F \bar{F}$ is
a semisimple algebra over $\bar{F}$, and the claim is that the two-sided ideal $N_{\bar{F}}$ of $A_{\bar{F}}$
is nilpotent. In fact, any element of $N_{\bar{F}}$ is a finite sum of the form $\sum_i (a_i \otimes c_i)$
with each $a_i$ in $N$ and each $c_i$ in $\bar{F}$. The product of this element with $\sum_j (a'_j \otimes c'_j)$
is $\sum_{i,j} (a_ia'_j \otimes c_ic'_j)$, and this is 0 because the assumption $N^2 = 0$ implies that
$a_ia'_j = 0$ for all $i$ and $j$. Thus $N_{\bar{F}}^2 = 0$, and $N_{\bar{F}}$ is nilpotent.

Since $A_{\overline{F}}/N_{\overline{F}}$ is semisimple and $N_{\overline{F}}$ is nilpotent, Lemma 2.34 shows that
$N_{\overline{F}} = \operatorname{rad}(A_{\overline{F}})$. Corollary 2.28 (a special case of Theorem 2.17) is applicable to
$A_{\overline{F}}$ because $\overline{F}$ is algebraically closed, and it follows that there exists a subalgebra
S of $A_{\overline{F}}$ such that $A_{\overline{F}} = S \oplus N_{\overline{F}}$ as vector spaces. Here S is a product of finitely
many algebras $M_{n_j}(\overline{F})$. The embedded matrix units $e_{ij}$ of S obtained from each
$M_{n_j}(\overline{F})$ are members of $A_{\overline{F}} = A \otimes_F \overline{F}$ and hence are of the form $\sum_{l=1}^n x_l \otimes c_l$,
where $\{x_l\}_{l=1}^n$ is a vector-space basis of A over F and each $c_l$ is in $\overline{F}$. Only finitely
many such $c_l$'s are needed to handle all $e_{ij}$'s, and we let K be a finite extension
of F within $\overline{F}$ containing all of them. Let $\rho_0 = 1, \rho_1, \dots, \rho_s$ be a vector-space
basis of K over F.

Relative to this K, we form $A_K$, $N_K$, and $\bar{A}_K$ as in the first paragraph of the
proof. The same argument as with $\overline{F}$ shows that $\bar{A}_K \cong A_K/N_K$ is semisimple
and that $N_K$ is nilpotent. By Lemma 2.34, $N_K = \operatorname{rad} A_K$. The formulas for the
$e_{ij}$'s in the previous paragraph are valid in $A_K$ and give us a system of matrix
units. As in the previous paragraph, Corollary 2.28 produces a subalgebra S of
$A_K$ isomorphic to some $M_{n_1}(K) \times \cdots \times M_{n_r}(K)$ such that $A_K = S \oplus N_K$ as
vector spaces.

In the basis $\{x_i\}_{i=1}^n$ of A over F, we may assume that the first t vectors form
a basis of N = rad A and the remaining vectors form a basis of a vector-space
complement to N. We identify members a of A with members $a \otimes 1$ of $A_K$.
With this identification in force, we decompose each basis vector $x_i$ for $i > t$
according to $A_K = S \oplus N_K$ as $x_i = y_i - z_i$ with $y_i \in S$ and $z_i \in N_K$. Since the
$x_i$'s for $i \le t$ are in $N \subseteq N_K$, the vectors $y_i$ with $i > t$ form a vector-space basis
of S over K. For $i > t$, write $z_i = \sum_{j=0}^s z_{ij} \otimes \rho_j$ with $z_{ij}$ in N. Then we have

$$y_i = x_i + z_i = (x_i + z_{i0}) + \sum_{j=1}^s z_{ij} \otimes \rho_j \qquad \text{for } i > t.$$

Put

$$x_i' = x_i + z_{i0} \quad \text{and} \quad z_i' = \sum_{j=1}^s z_{ij} \otimes \rho_j \qquad \text{for } i > t.$$

Then $\{x_i\}_{i=1}^t \cup \{x_i'\}_{i=t+1}^n$ is a basis of A over F. We shall show that
$S_0 = \sum_{i=t+1}^n F x_i'$ is a subalgebra of A, and then $A = S_0 \oplus N$ will be the required
decomposition.
Let $x_i'$ and $x_j'$ be given with $i > t$ and $j > t$, and write

$$x_i' x_j' = \sum_{k > t} \gamma_{kij} x_k' + v_{ij} \qquad \text{with } \gamma_{kij} \in F \text{ and } v_{ij} \in N.$$

Substituting $x_i' = y_i - z_i'$ and taking into account that $N_K$ is an ideal in $A_K$, we
have

$$y_i y_j \equiv \sum_k \gamma_{kij} x_k' \bmod N_K \equiv \sum_k \gamma_{kij} y_k \bmod N_K.$$

Then $y_i y_j = \sum_k \gamma_{kij} y_k + u_{ij}$ with each $u_{ij} \in N_K$. Since the $y_i$ are in S and S
is a subalgebra, $u_{ij} = 0$. Thus $y_i y_j = \sum_k \gamma_{kij} y_k$. Let us resubstitute into this
equality from $y_i = x_i' + z_i'$. Taking into account that $z_i' z_j' = 0$ because $N_K^2 = 0$,
we obtain

$$x_i' x_j' + x_i' z_j' + z_i' x_j' = \sum_k \gamma_{kij} x_k' + \sum_k \gamma_{kij} z_k'.$$

Substituting from $z_i' = \sum_{l=1}^s z_{il} \otimes \rho_l$ gives

$$x_i' x_j' \otimes 1 + \sum_{l=1}^s x_i' z_{jl} \otimes \rho_l + \sum_{l=1}^s z_{il} x_j' \otimes \rho_l = \sum_k \gamma_{kij} x_k' \otimes 1 + \sum_{k,l} \gamma_{kij} z_{kl} \otimes \rho_l.$$

The coefficients of $\rho_0 = 1$ must be equal, and therefore

$$x_i' x_j' = \sum_k \gamma_{kij} x_k'.$$

This equation shows that $S_0$ is a subalgebra and completes the proof under the
hypothesis that $N^2 = 0$.

Now we drop the assumption that $N^2 = 0$. We shall prove the theorem
by induction on $\dim_F A$, the base cases of the induction being $\dim_F A = 0$
and $\dim_F A = 1$, for which the theorem is immediate by inspection. For the
inductive case, let A be given, and assume the theorem to be known for algebras of
dimension $< \dim_F A$. If $N^2 = 0$, then we are done. Thus we may assume that the
product ideal $N^2$ is nonzero and therefore that $\dim_F (A/N^2) < \dim_F A$. The First
Isomorphism Theorem shows that $(A/N^2)/(N/N^2) \cong A/N = \bar{A}$. The quotient
A/N is semisimple, and $N/N^2$ is a nilpotent ideal in $A/N^2$. By Lemma 2.34,
$N/N^2 = \operatorname{rad}(A/N^2)$. The inductive hypothesis gives $A/N^2 = S_1/N^2 \oplus N/N^2$
for a subalgebra $S_1$ of A with $S_1 \supseteq N^2$. This means that $A = S_1 + N$ and
$S_1 \cap N = N^2$. Here

$$\dim_F A = \dim_F(S_1 + N) = \dim_F S_1 + \dim_F N - \dim_F(S_1 \cap N) = \dim_F S_1 + \dim_F N - \dim_F N^2 = \dim_F S_1 + \dim_F(N/N^2),$$

and $N/N^2 \ne 0$ implies $\dim_F S_1 < \dim_F A$. The Second Isomorphism Theorem
gives $A/N = (S_1 + N)/N \cong S_1/(S_1 \cap N) = S_1/N^2$. Thus $S_1/N^2$ is semisimple.
Since $N^2$ is nilpotent, Lemma 2.34 shows that $N^2 = \operatorname{rad} S_1$. The inductive
hypothesis gives $S_1 = S \oplus N^2$ for a semisimple subalgebra S. Substituting
into $A = S_1 + N$, we obtain $A = (S \oplus N^2) + N = S + N$. Meanwhile,
$S \cap N = (S \cap S_1) \cap N = S \cap (S_1 \cap N) = S \cap N^2 = 0$. Therefore $A = S \oplus N$,
and the induction is complete.


In this section we begin an investigation of division algebras that are
finite-dimensional over a given field F. A nonzero algebra A with identity over a field
F will be called central if the center of A consists exactly of the scalar multiples
of the identity, i.e., if center(A) = F. Of special interest will be algebras with
identity that are central simple, i.e., are both central and simple.


Lemma 2.35. Let A and B be algebras with identity over a field F, and
suppose that B is central. Then


(a) the members of $A \otimes_F B$ commuting with $1 \otimes B$ are the members of $A \otimes 1$,
(b) $\operatorname{center}(A \otimes_F B) = (\operatorname{center} A) \otimes_F 1$.


PROOF. For (a), suppose that $z = \sum_i a_i \otimes b_i$ commutes with $1 \otimes B$ and that
the $a_i$ are linearly independent over F. If b is in B, then

$$0 = (1 \otimes b)z - z(1 \otimes b) = \sum_i a_i \otimes (b b_i - b_i b),$$

from which it follows that $b b_i - b_i b = 0$ for all b and all i. Since B is central,
each $b_i$ is in F, and we can write z as

$$z = \sum_i a_i \otimes b_i = \sum_i (a_i b_i \otimes 1) = \Big(\sum_i a_i b_i\Big) \otimes 1.$$

In other words, z is of the form $z = a \otimes 1$.

For (b), we need to prove the inclusion $\subseteq$. Thus let z be in $\operatorname{center}(A \otimes_F B)$.
By (a), z is of the form $z = a \otimes 1$ for some $a \in A$. Now suppose that $a'$ is in A.
Then $0 = (a' \otimes 1)z - z(a' \otimes 1) = (a'a - aa') \otimes 1$. Hence $a'a = aa'$, and we
conclude that a is in center(A).


Proposition 2.36. Let A and B be algebras with identity over a field F, and 
suppose that B is central simple. Then 
(a) A simple implies $A \otimes_F B$ simple,
(b) A central simple implies $A \otimes_F B$ central simple.


PROOF. For (a), Proposition 2.31 shows that any two-sided ideal of $A \otimes_F B$ is
of the form $I \otimes_F B$ for some two-sided ideal I of A. Since A is assumed simple,
the only I's are 0 and A. Thus the only ideals in $A \otimes_F B$ are 0 and $A \otimes_F B$, and
$A \otimes_F B$ is simple.

For (b), conclusion (a) shows that $A \otimes_F B$ is simple. By Lemma 2.35b the
center of $A \otimes_F B$ is $(\operatorname{center} A) \otimes 1 = F1 \otimes 1 = F(1 \otimes 1)$, and hence $A \otimes_F B$
is central.


Corollary 2.37. If A and B are finite-dimensional semisimple algebras over a 
field F and at least one of them is separable over F, then $A \otimes_F B$ is semisimple.


REMARK. The definition of separability of A or B appears between Proposition 
2.33 and Proposition 2.33’. 


PROOF. Without loss of generality, we may assume that A and B are simple. 
For definiteness let us say that A is the given separable algebra over F. Let 
K = center(B). Lemma 2.32 shows that K is a field, and associativity of tensor
products allows us to write


$$A \otimes_F B \cong A \otimes_F (K \otimes_K B) \cong (A \otimes_F K) \otimes_K B.$$


Here $A \otimes_F K$ is semisimple by Proposition 2.33', and B is central simple over
K. Thus Proposition 2.36a applies to each simple component of $A \otimes_F K$ and shows
that $(A \otimes_F K) \otimes_K B$ is a product of simple algebras, hence is semisimple.


Corollary 2.38. Let A be a central simple algebra of finite dimension n over
a field F, and let $A^o$ be the opposite algebra. Then $A \otimes_F A^o \cong M_n(F)$.


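EXAMPLE. Take $F = \mathbb{R}$ and $A = \mathbb{H}$, the algebra of quaternions. Then
conjugation, with $1 \mapsto 1$ and $i, j, k \mapsto -i, -j, -k$, is an antiautomorphism of
$\mathbb{H}$. Consequently $\mathbb{H}^o \cong \mathbb{H}$. The corollary says in this case that $\mathbb{H} \otimes_{\mathbb{R}} \mathbb{H} \cong M_4(\mathbb{R})$.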


PROOF. Let V be A considered as a vector space. For each $a_0 \in A$, we associate
the members $l(a_0)$ and $r(a_0)$ of $\operatorname{End}_F(V)$ given by $l(a_0)a = a_0 a$ and $r(a_0)a =
a a_0$. Then $l(a_0 a_0') = l(a_0) l(a_0')$ and $r(a_0 a_0') = r(a_0') r(a_0)$, and it follows
that $l : A \to \operatorname{End}_F(V)$ and $r : A^o \to \operatorname{End}_F(V)$ are algebra homomorphisms
sending 1 to 1.

Meanwhile, the map $A \times A^o \to \operatorname{End}_F(V)$ given by $(a, a') \mapsto l(a) r(a')$ is F
bilinear and extends to an F linear map $\varphi : A \otimes_F A^o \to \operatorname{End}_F(V)$. Because of the
homomorphism properties of l and r, the mapping $\varphi$ is an algebra homomorphism
sending 1 to 1. Proposition 2.36 shows that $A \otimes_F A^o$ is simple, and it follows
that $\varphi$ is one-one. Since $\dim_F(A \otimes_F A^o) = (\dim_F A)^2 = \dim_F \operatorname{End}_F(V)$, $\varphi$ is
onto.
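
The surjectivity in this proof can be checked by machine in the case of the quaternions. The following is a minimal sketch, not part of the text's argument: it assumes the usual basis 1, i, j, k with quaternions stored as 4-tuples, and the helper names `qmul`, `operator_matrix`, and `rank` are illustrative choices. It forms the 16 operators $v \mapsto pvq$ for basis elements p and q, which are the images of the basis tensors under the map of the proof, and computes the dimension of their span inside the 16-dimensional space of 4-by-4 matrices.

```python
from fractions import Fraction
from itertools import product

def qmul(p, q):
    """Product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

BASIS = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]

def operator_matrix(p, q):
    """Flattened 4x4 matrix of v -> p*v*q in the basis 1, i, j, k."""
    columns = [qmul(qmul(p, v), q) for v in BASIS]
    return [columns[col][row] for row in range(4) for col in range(4)]

def rank(rows):
    """Rank of a list of vectors, by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(len(m)):
            if r != rk and m[r][col] != 0:
                factor = m[r][col] / m[rk][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[rk])]
        rk += 1
    return rk

# One operator for each basis tensor p (x) q of H (x) H^o.
images = [operator_matrix(p, q) for p, q in product(BASIS, repeat=2)]
print(rank(images))  # 16: the operators span all 4x4 matrices
```

The rank is computed with exact rational arithmetic, so the value 16 does not depend on floating-point tolerances.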


Corollary 2.39. Let A be a central simple algebra of finite dimension d over 
a field F. Then d is the square of an integer. 


PROOF. Let $\overline{F}$ be an algebraic closure of F. Proposition 2.36a shows that
the algebra $\overline{F} \otimes_F A$ is simple, and its dimension over $\overline{F}$ is d. A simple
finite-dimensional algebra over an algebraically closed field is a full matrix algebra over
that field, and thus $\overline{F} \otimes_F A \cong M_n(\overline{F})$. Comparing dimensions over $\overline{F}$, we see
that $d = n^2$.


Corollary 2.40. If D is a division algebra finite-dimensional over its center 
F, then $\dim_F D$ is the square of an integer.


PROOF. The algebra D is central simple over its center F, and the result is
immediate from Corollary 2.39.


Theorem 2.41 (Skolem—Noether Theorem). Let A be a finite-dimensional 
central simple algebra over the field F, and let B be any simple algebra over F.
Suppose that f and g are F algebra homomorphisms of B into A carrying the
identity to the identity. Then there exists an $x \in A$ with $f(b) = x g(b) x^{-1}$ for all
b in B.


PROOF. Let us observe that the homomorphisms f and g are one-one because
B is simple, and the finite dimensionality of A therefore forces B to be
finite-dimensional.

We consider first the special case that $A = M_n(F)$ for some n. The
homomorphism f makes the space $F^n$ of column vectors into a unital left B module by
the definition $bv = f(b)v$, and similarly the homomorphism g makes $F^n$ into a
unital left B module. Since B is finite-dimensional and simple, an argument given
with Example 1 of semisimple rings in Section 2 shows that there is only one
simple left B module up to isomorphism and that every unital left B module is a
direct sum of copies of this simple left B module. Consequently the isomorphism
classes of the B modules determined by f and g depend only on their dimension.
The dimension is n in both cases, and hence there exists an invertible F linear
map $L : F^n \to F^n$ such that $L f(b) v = g(b) L v$ for all $b \in B$ and $v \in F^n$. If L is given by
the matrix $x^{-1}$ in $M_n(F)$, then $x^{-1} f(b) = g(b) x^{-1}$, and the theorem is therefore
proved in this special case.

For the general case we form the tensor products $B \otimes_F A^o$ and $A \otimes_F A^o$. The
maps $f \otimes 1$ and $g \otimes 1$ are F algebra homomorphisms between these algebras,
$B \otimes_F A^o$ is simple by Proposition 2.36a, and Corollary 2.38 shows that $A \otimes_F A^o$
is isomorphic to $M_n(F)$ for the integer $n = \dim_F A$. The special case is applicable,
and we obtain an invertible element X of $A \otimes_F A^o$ such that

$$(f \otimes 1)(b \otimes a^o) = X (g \otimes 1)(b \otimes a^o) X^{-1} \qquad \text{for all } b \in B \text{ and } a^o \in A^o. \qquad (*)$$

Taking b = 1, we see that $1 \otimes a^o = X(1 \otimes a^o)X^{-1}$ for all $a^o \in A^o$. By Lemma
2.35a, X lies in $A \otimes 1$, hence is of the form $X = x \otimes 1$ for some x in A. Substituting
for X in (*), we obtain $f(b) = x g(b) x^{-1}$ as required.


Corollary 2.42. If A is a finite-dimensional central simple algebra over the 
field F, then every F automorphism of A is inner in the sense of being given by 
conjugation by an invertible element of A. 


PROOF. This is the special case of Theorem 2.41 in which B = A and g is the
identity map on B.
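
As an illustration of Theorem 2.41, take A to be the quaternions over the reals and B to be the complex numbers, with g the inclusion $a + bi \mapsto a + bi$ and f the map $a + bi \mapsto a - bi$. The theorem predicts a quaternion x with $f(b) = x g(b) x^{-1}$, and x = j is one such element. The sketch below checks this on a few sample values; the tuple encoding of quaternions and the helper names `qmul`, `qinv`, and `conj_by` are assumptions made for the example only.

```python
from fractions import Fraction

def qmul(p, q):
    """Product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qinv(q):
    """Inverse of a nonzero quaternion: the conjugate divided by the norm."""
    a, b, c, d = q
    n = Fraction(a*a + b*b + c*c + d*d)
    return (a / n, -b / n, -c / n, -d / n)

def conj_by(x, q):
    """Return x q x^{-1}."""
    return qmul(qmul(x, q), qinv(x))

def g(a, b):
    """Inclusion of a + bi into the quaternions."""
    return (a, b, 0, 0)

def f(a, b):
    """Complex conjugation followed by the inclusion."""
    return (a, -b, 0, 0)

J = (0, 0, 1, 0)
samples = [(1, 0), (0, 1), (2, -3), (5, 7)]
ok = all(conj_by(J, g(a, b)) == f(a, b) for a, b in samples)
print(ok)  # True: conjugation by j restricts to complex conjugation
```

Since both sides are real linear in (a, b), agreement on the two samples (1, 0) and (0, 1) already determines agreement everywhere; the extra samples are only a sanity check.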


8. Double Centralizer Theorem 


We saw in Corollary 2.40 that if D is a division algebra finite-dimensional over
its center F, then $\dim_F D$ is the square of an integer. In this section we shall
prove a theorem from which we can conclude that the positive integer of which
$\dim_F D$ is the square is the dimension of any maximal subfield of D. We state the
theorem, establish two lemmas, prove the theorem, and then derive two corollaries 
concerning maximal subfields of division algebras. 

If A is an algebra with identity and B is a subalgebra containing the identity,
then the centralizer of B in A is the subalgebra of all members of A commuting
with every element of B.


Theorem 2.43 (Double Centralizer Theorem). Let A be a finite-dimensional
central simple algebra over a field F, let B be a simple subalgebra of A, and let
C be the centralizer of B in A. Then C is simple, B is the centralizer of C in A,
and $(\dim_F B)(\dim_F C) = \dim_F A$.


Lemma 2.44. Let A and A' be algebras with identity over a field F, let B and
B' be subalgebras of them, and let C and C' be the centralizers of B and B' in A
and A', respectively. Then the centralizer of $B \otimes_F B'$ in $A \otimes_F A'$ is $C \otimes_F C'$.


PROOF. Expand an element of $A \otimes_F A'$ for the moment as $x = \sum_i a_i \otimes a_i'$ with
the elements $a_i'$ linearly independent over F. If x satisfies $x(b \otimes 1) = (b \otimes 1)x$
for all b in B, then $\sum_i (a_i b - b a_i) \otimes a_i' = 0$. Since the $a_i'$'s are independent,
$a_i b - b a_i = 0$ for all i, and each $a_i$ is in C. Thus the centralizer of $B \otimes 1$ is
$C \otimes_F A'$.

Rewriting x with the $a_i$'s assumed independent, we see similarly that the
centralizer of $1 \otimes_F B'$ is $A \otimes_F C'$. Putting these conclusions together, we see that

$$\operatorname{centralizer}(B \otimes_F B') \subseteq \operatorname{centralizer}(B \otimes_F 1) \cap \operatorname{centralizer}(1 \otimes_F B') = (C \otimes_F A') \cap (A \otimes_F C') = C \otimes_F C'.$$

The reverse inclusion, namely $\operatorname{centralizer}(B \otimes_F B') \supseteq C \otimes_F C'$, is immediate,
and the lemma follows.


Lemma 2.45. Let B be a finite-dimensional simple algebra over a field F, and
write V for the algebra B considered as a vector space. For b in B and v in V,
define members l(b) and r(b) of $\operatorname{End}_F(V)$ by l(b)v = bv and r(b)v = vb. Then
the centralizer in $\operatorname{End}_F(V)$ of l(B) is r(B).


PROOF. Let K be the center of B. This is an extension field of F by Lemma
2.32, and B is central simple over K. Let us see that any member a of $\operatorname{End}_F(V)$
that centralizes l(B) is actually in $\operatorname{End}_K(V)$. If c is in K, then c is in particular
in B, and therefore a l(c) = l(c) a. Applying this equality to $v \in V$ yields
a(cv) = ca(v), and this equality for all $c \in K$ says that a is in $\operatorname{End}_K(V)$.

Thus it is enough to show that the centralizer of l(B) in $\operatorname{End}_K(V)$ is r(B).
We argue as in the proof of Corollary 2.38: The definitions of l and r make V
into a unital left B module and a unital right B module, and the members of K
operate consistently on either side of V because K lies in the center of B. The
function $(b, b') \mapsto l(b) r(b')$ is therefore K bilinear, and it extends to the tensor
product $B \otimes_K B^o$ as an algebra homomorphism $\varphi : B \otimes_K B^o \to \operatorname{End}_K(V)$. The
homomorphism $\varphi$ is one-one, since Proposition 2.36a shows $B \otimes_K B^o$ to be simple.
The dimensional equality $\dim_K(B \otimes_K B^o) = (\dim_K B)^2 = \dim_K(\operatorname{End}_K(V))$
allows us to conclude that $\varphi$ is onto, hence is an isomorphism.

Lemma 2.35a shows that the centralizer of $B \otimes_K 1$ in $B \otimes_K B^o$ is $1 \otimes_K B^o$.
If this statement is translated from the context of $B \otimes_K B^o$ into the isomorphic
context of $\operatorname{End}_K(V)$, then the centralizer of l(B) in $\operatorname{End}_K(V)$ is r(B), and we
saw that this fact is sufficient to imply the lemma.


PROOF OF THEOREM 2.43. Let V be the algebra B considered as a vector
space over F, and let l(B) and r(B) be the sets of those members of $\operatorname{End}_F(V)$
that are given by left multiplication and right multiplication by members of B.
The algebra A is central simple by assumption, and $\operatorname{End}_F(V)$ is central simple,
being isomorphic to $M_n(F)$ for the integer $n = \dim_F(V)$. By Proposition 2.36b,
$A \otimes_F \operatorname{End}_F(V)$ is central simple. We define two algebra homomorphisms f and
g of B into $A \otimes_F \operatorname{End}_F(V)$ by $f(b) = b \otimes 1$ and $g(b) = 1 \otimes l(b)$.

The Skolem-Noether Theorem (Theorem 2.41) produces an element x of
$A \otimes_F \operatorname{End}_F(V)$ with $f(b) = x g(b) x^{-1}$ for all $b \in B$. Hence

$$B \otimes_F 1 = x(1 \otimes_F l(B))x^{-1}. \qquad (*)$$

Lemma 2.44 shows that the centralizer of $B \otimes_F 1$ in $A \otimes_F \operatorname{End}_F(V)$ is
$C \otimes_F \operatorname{End}_F(V)$ and that the centralizer of $1 \otimes_F l(B)$ is $A \otimes_F r(B)$. From
the latter identification the centralizer of $x(1 \otimes_F l(B))x^{-1}$ is $x(A \otimes_F r(B))x^{-1}$.
Combining (*) with these computations of centralizers, we see that

$$C \otimes_F \operatorname{End}_F(V) = x(A \otimes_F r(B))x^{-1}. \qquad (**)$$

The algebra $A \otimes_F r(B)$ is isomorphic to $A \otimes_F B^o$, which is simple by Proposition
2.36a. Therefore $C \otimes_F \operatorname{End}_F(V)$ is simple, and C has to be simple.

Equating the dimensions of the two sides of (**) gives

$$(\dim_F C)(\dim_F B)^2 = (\dim_F C)(\dim_F \operatorname{End}_F(V)) = \dim_F(C \otimes_F \operatorname{End}_F(V)) = \dim_F(A \otimes_F r(B)) = (\dim_F A)(\dim_F B),$$

and hence

$$(\dim_F B)(\dim_F C) = \dim_F A.$$

Finally the centralizer D of C contains B, and two applications of the dimensional
equality give

$$(\dim_F D)(\dim_F C) = \dim_F A = (\dim_F C)(\dim_F B).$$


Thus $\dim_F D = \dim_F B$, and we must have D = B. In other words, B is the
centralizer of C.


Corollary 2.46. Let D be a central finite-dimensional division algebra over
the field F. If K is any maximal subfield of D, then $\dim_F D = (\dim_F K)^2$.


PROOF. Apply the Double Centralizer Theorem (Theorem 2.43) with A =
D. Let Z(K) be the centralizer of the simple subalgebra K in D. Since K is
commutative, $K \subseteq Z(K)$. If a is in Z(K) but not K, then K(a) is a field in D
properly containing K, in contradiction to the assumption that K is a maximal
subfield of D. Hence K = Z(K). The dimensional equality in the theorem
therefore gives $\dim_F D = (\dim_F K)(\dim_F Z(K)) = (\dim_F K)^2$.
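
For a concrete instance of Corollary 2.46, take D to be the quaternions over the reals and K the copy of the complex numbers spanned by 1 and i. The sketch below is a finite sanity check rather than a proof: over a small box of integer quaternions it confirms that the elements commuting with i are exactly those with no j or k component, in agreement with Z(K) = K and with the count 4 = 2 · 2. The tuple encoding and the helper name `qmul` are assumptions of the example.

```python
from itertools import product

def qmul(p, q):
    """Product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

I = (0, 1, 0, 0)
box = range(-2, 3)  # integer coordinates -2, ..., 2

# Commuting with i is the same as commuting with all of K = span{1, i}.
commuting = [q for q in product(box, repeat=4) if qmul(I, q) == qmul(q, I)]
in_K = [q for q in product(box, repeat=4) if q[2] == 0 and q[3] == 0]

print(commuting == in_K)  # True: the centralizer of K meets the box exactly in K
dim_D, dim_K = 4, 2
print(dim_D == dim_K * dim_K)  # True
```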


Corollary 2.47. Let A be a finite-dimensional central simple algebra over a
field F, and let K be a subfield of A. Then the following are equivalent:
(a) K is its own centralizer,
(b) $\dim_F A = (\dim_F K)^2$,
(c) K is a maximal commutative subalgebra of A.


PROOF. Let Z(K) be the centralizer of K in A. The Double Centralizer
Theorem (Theorem 2.43) gives the equality

$$\dim_F A = (\dim_F K)(\dim_F Z(K)). \qquad (*)$$

If (a) holds, then Z(K) = K, and (*) yields (b).
If (b) holds, then (*) and the equality $\dim_F A = (\dim_F K)^2$ together imply
that $\dim_F Z(K) = \dim_F K$. Since K is commutative, $Z(K) \supseteq K$. The equality
of dimensions implies that Z(K) = K, and then (c) follows.
If (c) holds, we start from the inclusion $K \subseteq Z(K)$. If x is in Z(K) but
not K, then K(x) is a field strictly larger than K, in contradiction to (c). Thus
K = Z(K), and (a) holds.


9. Wedderburn’s Theorem about Finite Division Rings 
The theorem of this section is as follows. 
Theorem 2.48 (Wedderburn). Every finite division ring is a field. 
The proof will be preceded by a lemma. 


Lemma 2.49. If G is a finite group and H is a proper subgroup, then
$\bigcup_{g \in G} gHg^{-1}$ does not exhaust G.


PROOF. In the union $\bigcup_{g \in G} gHg^{-1}$, the terms corresponding to g and to gh, for
h in H, are the same because $(gh)H(gh)^{-1} = g(hHh^{-1})g^{-1} = gHg^{-1}$. Thus
the union can be rewritten as $\bigcup_{gH \in G/H} gHg^{-1}$, it being understood that only one g is
used from each coset gH. From this rewritten form of the union, we see that the
number of elements other than the identity in the union is

$$\le [G:H](|H| - 1) = [G:H]\,|H| - [G:H] = |G| - [G:H] < |G| - 1,$$

and the lemma follows.
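
The count in this proof can be watched in a small case. The sketch below is an illustration under assumed choices, not part of the text: G is the symmetric group on three letters, with permutations stored as tuples, and H is the subgroup of order 2 generated by one transposition. The helper names `compose` and `inverse` are hypothetical. The script forms the union of all conjugates of H and compares its size with the bound $[G:H](|H| - 1) + 1$ from the proof.

```python
from itertools import permutations

def compose(p, q):
    """Composition of permutations: (p o q)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)

def inverse(p):
    """Inverse permutation."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

G = list(permutations(range(3)))   # symmetric group on {0, 1, 2}
H = [(0, 1, 2), (1, 0, 2)]         # identity and the transposition (0 1)

# Union of the conjugates g H g^{-1} over all g in G.
union = {compose(compose(g, h), inverse(g)) for g in G for h in H}

index = len(G) // len(H)
bound = index * (len(H) - 1) + 1   # non-identity count from the proof, plus the identity
print(len(union), bound, len(G))   # 4 4 6
```

Here the union consists of the identity and the three transpositions, so the bound is attained, and the two 3-cycles are left out.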


PROOF OF THEOREM 2.48. Let D be a finite division ring, and let F be the
center. Then F is a field, say of q elements. Maximal subfields of D certainly
exist. Any such subfield K has $\dim_F D = (\dim_F K)^2$ by Corollary 2.46, and
hence any two such subfields K and K' are isomorphic. The Skolem-Noether
Theorem (Theorem 2.41) shows that $K' = xKx^{-1}$ for some invertible x in the
group $D^\times$ of invertible elements of D.

On the other hand, F and any element of D generate a subfield of D, and this
subfield is contained in a maximal subfield. Consequently any element of D is
contained in some such K', and $D = \bigcup_{x \in D^\times} xKx^{-1}$. Discarding the element 0
from both sides, we obtain $D^\times = \bigcup_{x \in D^\times} xK^\times x^{-1}$. Applying Lemma 2.49 to the
group $G = D^\times$ and the subgroup $H = K^\times$, we see that $K^\times$ cannot be a proper
subgroup of $D^\times$. Therefore D = K, and D is commutative.


10. Frobenius’s Theorem about Division Algebras over the Reals 


We conclude this chapter by bringing together our results to prove the following 
celebrated theorem of Frobenius. 


Theorem 2.50 (Frobenius). Up to R isomorphism the only finite-dimensional 
associative division algebras over R are the algebras R of reals, C of complex 
numbers, and H of quaternions. 


REMARKS. The text of this chapter has not produced any concrete examples 
of noncommutative division rings other than the quaternions. Problems 12-16 at 
the end of the chapter produce generalized quaternion algebras in which R can 
be replaced by many other fields; there are infinitely many nonisomorphic such 
examples when the field is Q. In addition, Problems 17-19 produce examples 
of central division algebras of dimension 9 over suitable base fields. The next 
chapter will give further insight into the construction of division algebras.


PROOF. Let D be such a division algebra, and let F be the center. Then
F is a finite extension field of R and must be R or C, since C is algebraically
closed. If F = C, then we have seen that D = C. Thus we may assume that
center(D) = R.

Let K be a maximal subfield of D (existence by finite dimensionality), and let
$n = \dim_{\mathbb{R}} K$. Corollary 2.46 shows that $\dim_{\mathbb{R}} D = n^2$. Since K has to be R or
C, n has to be 1 or 2. If n = 1, we obtain D = R. Thus we may assume that
n = 2, K = C, and $\dim_{\mathbb{R}} D = 4$.

The map $f : K \to D$ given by $f(a + bi) = a - bi$, where i is the member
of K corresponding to $\sqrt{-1}$ in C, is an algebra homomorphism into a central
simple algebra over R, and so is the map $g : K \to D$ given by $g(a + bi) = a + bi$.
By the Skolem-Noether Theorem (Theorem 2.41), there exists some x in D with
$x(a + bi)x^{-1} = a - bi$ for all a and b in R.

This element x has the property that $x^2$ commutes with every element of K
and must lie in K, by Corollary 2.47. Let us see that $x^2$ lies in center(D) = R.
In fact, otherwise 1 and $x^2$ would generate K as an R algebra, and every member
of D commuting with 1 and $x^2$ would commute with all of K; since x commutes
with 1 and $x^2$, x would have to commute with K, contradiction. Thus $x^2$ lies
in R.

If $x^2 > 0$, then $x^2 = r^2$ for some $r \in \mathbb{R}$. The elements x and r together lie in
some subfield K' of D, and K' has no zero divisors. Since $(x - r)(x + r) = 0$
within K', we conclude that $x = \pm r$. Then x commutes with the maximal
subfield K above, and we arrive at a contradiction.

Thus $x^2 < 0$. Write $x^2 = -y^2$ for some $y \in \mathbb{R}$, and put $j = y^{-1}x$. The
equation $x(a + bi)x^{-1} = a - bi$ says that $j(a + bi)j^{-1} = a - bi$ and in particular
that $jij^{-1} = -i$. Define k = ij.

We have $j^2 = y^{-2}x^2 = -1$. Hence $k^2 = ijij = i(jij^{-1})j^2 = i(-i)(-1) =
i^2 = -1$. Then $ijk = -1$, and $k = -1(j^{-1})(i^{-1}) = -1(-j)(-i) = -ji$;
hence $ij + ji = 0$.

Let us show that {1, i, j, k} is a linearly independent set over R. Certainly j is
not an R linear combination of 1 and i. If $k = a + bi + cj$ for some $a, b, c \in \mathbb{R}$,
then squaring gives

$$-1 = k^2 = a^2 + b^2 i^2 + c^2 j^2 + 2abi + 2acj + bc(ij + ji) = a^2 - b^2 - c^2 + 2abi + 2acj.$$

Equating coefficients of 1, i, and j, we obtain $-1 = a^2 - b^2 - c^2$, $ab = 0$,
and $ac = 0$. We cannot have $-1 = a^2$, and thus at least one of b and c is
nonzero. Then a = 0, and $ij = k = bi + cj$. Left multiplication by i gives
$-j = -b + c\,ij = -b + c(bi + cj)$; equating coefficients shows that b = 0.
Hence ij = cj, and we arrive at the contradiction $i = c \in \mathbb{R}$. We conclude that
{1, i, j, k} is linearly independent over R.

To complete the proof that D is isomorphic to H, we have only to verify that
{1, i, j, k} satisfies the usual multiplication table for H. We know that $i^2 = j^2 =
k^2 = -1$, that k = ij, and that k = -ji. The last of these says that ji = -k.
The other verifications are

$$jk = jij = (jij^{-1})j^2 = (-i)(-1) = i,$$
$$kj = ijj = i(-1) = -i,$$
$$ki = iji = i(jij^{-1})j = i(-i)j = j,$$
$$ik = iij = (-1)j = -j,$$


and the proof is complete. 
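
The multiplication table just verified can be checked mechanically in a concrete model. The sketch below uses one standard realization of the quaternion units as 2-by-2 complex matrices; this model and the helper names `matmul` and `neg` are assumptions of the example and do not appear in the proof. As in the proof, k is defined as the product ij, and the remaining relations are then tested.

```python
def matmul(a, b):
    """Product of 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[r][t] * b[t][c] for t in range(2)) for c in range(2))
        for r in range(2)
    )

def neg(a):
    """Negative of a matrix."""
    return tuple(tuple(-x for x in row) for row in a)

ONE = ((1, 0), (0, 1))
I = ((1j, 0), (0, -1j))   # assumed model for the unit i
J = ((0, 1), (-1, 0))     # assumed model for the unit j
K = matmul(I, J)          # k is defined as ij, as in the proof

checks = {
    "i^2 = -1": matmul(I, I) == neg(ONE),
    "j^2 = -1": matmul(J, J) == neg(ONE),
    "k^2 = -1": matmul(K, K) == neg(ONE),
    "ji = -k": matmul(J, I) == neg(K),
    "jk = i": matmul(J, K) == I,
    "kj = -i": matmul(K, J) == neg(I),
    "ki = j": matmul(K, I) == J,
    "ik = -j": matmul(I, K) == neg(J),
}
print(all(checks.values()))  # True
```

All entries involved are 0, ±1, or ±i, so the complex floating-point arithmetic here is exact.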


11. Problems 


In all the problems below, all algebras are assumed to be associative. 
1. Let G be a finite group, and let CG be its complex group algebra. Prove that 
CG is a semisimple ring, and identify the constituent matrix algebras that arise 
for CG in Theorem 2.2 in terms of the irreducible representations of G. 
2. Wedderburn’s Main Theorem (Theorem 2.17) decomposes finite-dimensional 
algebras A in characteristic 0 as $A = S \oplus \operatorname{rad} A$ for some subalgebra S.
(a) What explicitly is a decomposition $A = S \oplus \operatorname{rad} A$ for the complex algebra
$\mathbb{C}[X]/(X^2 + 1)^2$?
(b) Is the subalgebra S in (a) unique? Prove that it is, or give a counterexample.
(c) Answer the same questions as for (a) and (b) in the case of the real algebra
$\mathbb{R}[X]/(X^2 + 1)^2$.
3. Let A and B be finite-dimensional algebras with identity over a field F, and
suppose that B is central simple. Prove that $\operatorname{rad}(A \otimes_F B) = (\operatorname{rad} A) \otimes_F B$.


Problems 4—7 concern commutative Artinian rings. Let R be such a ring. 

4. Prove that 
(a) R has only finitely many maximal ideals, 
(b) rad R is the set of all nilpotent elements in R, 
(c) R is semisimple if and only if it has no nonzero nilpotent elements, 
(d) R semisimple implies that R is the direct product of fields. 

5. Let $\bar{e}$ be an idempotent in R/rad R. Prove that the idempotent $e \in R$ in
Proposition 2.23 with $\bar{e} = e + \operatorname{rad} R$ is unique.

6. Problem 4a shows that R has only finitely many maximal ideals. Let N be their 
product. Use Nakayama’s Lemma (Lemma 8.51 of Basic Algebra, restated in 
the present book on page xxiii) to prove that N is a nilpotent ideal in R. 

7. Deduce from the previous problem that any prime ideal in R contains one of the
finitely many maximal ideals, hence that every prime ideal in R is maximal.


Problems 8-11 concern triangular rings, which were introduced in an example after
Proposition 2.5. The problems ask for verifications for some assertions that were
made in that example without proof. The notation is as follows: R and S are rings
with identity, and M is a unital (R, S) bimodule. Define a set A and operations of
addition and multiplication symbolically by


$$A = \begin{pmatrix} R & M \\ 0 & S \end{pmatrix} = \left\{ \begin{pmatrix} r & m \\ 0 & s \end{pmatrix} \right\}$$

with

$$\begin{pmatrix} r & m \\ 0 & s \end{pmatrix} \begin{pmatrix} r' & m' \\ 0 & s' \end{pmatrix} = \begin{pmatrix} rr' & rm' + ms' \\ 0 & ss' \end{pmatrix}.$$


8. Prove that the left ideals in A are of the form $I_1 \oplus I_2$, where $I_2$ is a left ideal in
S and $I_1$ is a left R submodule of $R \oplus M$ containing $M I_2$. (Educational note:
Then similarly the right ideals in A are of the form $J_1 \oplus J_2$, where $J_1$ is a right
ideal in R and $J_2$ is a right S submodule of $M \oplus S$ containing $J_1 M$.)


9. (a) Prove that the ring A is left Noetherian if and only if R and S are left 
Noetherian and M satisfies the ascending chain condition for its left R 
submodules. 

(b) Prove that the ring A is right Noetherian if and only if R and S are right 
Noetherian and M satisfies the ascending chain condition for its right S
submodules. (Educational note: By similar arguments the conclusions 
of (a) and (b) remain valid if “Noetherian” is replaced by “Artinian” and 
“ascending” is replaced by “descending.”) 


10. If $A = \begin{pmatrix} R & R \\ 0 & S \end{pmatrix}$ is any ring such as $\begin{pmatrix} \mathbb{Q} & \mathbb{Q} \\ 0 & \mathbb{Z} \end{pmatrix}$ in which S is a (commutative) Noetherian
integral domain with field of fractions R and if $S \ne R$, prove that A is left
Noetherian and not right Noetherian, and A is neither left nor right Artinian.


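11. If $A = \begin{pmatrix} R & R \\ 0 & S \end{pmatrix}$ is a ring such as $\begin{pmatrix} \mathbb{R} & \mathbb{R} \\ 0 & \mathbb{Q} \end{pmatrix}$ in which R and S are fields with
$S \subseteq R$ and $\dim_S R$ is infinite, prove that A is left Noetherian and left Artinian,
and A is neither right Noetherian nor right Artinian.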


Problems 12-16 concern generalized quaternion algebras. Let F be a field of
characteristic other than 2, let K be a quadratic extension field, and let $\sigma$ be the
nontrivial element in the Galois group. The field K is necessarily of the form
$K = F(\sqrt{m})$ for some nonsquare $m \in F$, and the elements c of K for which $\sigma(c) = -c$
are the F multiples of $\sqrt{m}$. Fix an element $r \ne 0$ of F, and let A be the subset of
$M_2(K)$ given by

$$\left\{ \begin{pmatrix} a & b \\ r\sigma(b) & \sigma(a) \end{pmatrix} \right\}.$$

12. (a) Prove that A is a 4-dimensional algebra over F.
(b) Prove that A is central simple by examining $cx - xc$ for $c = \begin{pmatrix} \sqrt{m} & 0 \\ 0 & -\sqrt{m} \end{pmatrix}$
when $x \ne 0$ is in a two-sided ideal I and is not in $K = \left\{ \begin{pmatrix} a & 0 \\ 0 & \sigma(a) \end{pmatrix} \right\}$.


13. Prove that A is a division algebra if and only if r is not of the form $N_{K/F}(c)$ for
some $c \in K$. Why must A be isomorphic to $M_2(F)$ when A is not a division
algebra?


14. Prove that if r and r' are two members of F such that $r = r' N_{K/F}(c)$ for some c
in K, then the algebra A associated to r is isomorphic to the algebra associated
to r'.


15. Let {1, i, j, k} be the F basis of A consisting of the matrices

$$1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad i = \begin{pmatrix} \sqrt{m} & 0 \\ 0 & -\sqrt{m} \end{pmatrix}, \quad j = \begin{pmatrix} 0 & 1 \\ r & 0 \end{pmatrix}, \quad k = \begin{pmatrix} 0 & \sqrt{m} \\ -r\sqrt{m} & 0 \end{pmatrix}.$$

Prove that these satisfy $i^2 = m1$, $j^2 = r1$, $k^2 = -rm1$, $ij = k = -ji$,
$jk = -ri = -kj$, and $ki = -mj = -ik$.

16. By going over the proof of Theorem 2.50 and using the relations of the previous
problem, prove that every central simple algebra of dimension 4 over F is of the
same kind as A for some quadratic extension $K = F(\sqrt{m})$ and some member
$r \ne 0$ of F.


Problems 17-19 concern cyclic algebras, which were introduced by L. E. Dickson. 
These extend the theory of generalized quaternion algebras to other sizes of matrices. 
The analogy with the theory in Problems 12-16 is tightest when the size is a prime. 
For notational simplicity this set of problems asks about size 3. Let F be any field, and
let K be a finite Galois extension of F with cyclic Galois group. It is assumed in these
problems that K has degree 3 over F and that $\{1, \sigma, \sigma^2\}$ is the Galois group. Fix an
element $r \ne 0$ of F, and let A be the subset of $M_3(K)$ given by

$$\left\{ \begin{pmatrix} a & b & c \\ r\sigma(c) & \sigma(a) & \sigma(b) \\ r\sigma^2(b) & r\sigma^2(c) & \sigma^2(a) \end{pmatrix} \right\}.$$

Identifying $a \in K$ with the member $\begin{pmatrix} a & 0 & 0 \\ 0 & \sigma(a) & 0 \\ 0 & 0 & \sigma^2(a) \end{pmatrix}$ of A and letting j be the member
$\begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ r & 0 & 0 \end{pmatrix}$ of A allows one to view A as the set of all matrices $a + bj + cj^2$ with
$a, b, c \in K$. The element j satisfies $jaj^{-1} = \sigma(a)$ for $a \in K$ and $j^3 = r$.

17. Arguing as for Problem 12, show that A is an algebra over F and that it is central
simple of dimension 9.

18. Using the general theory, prove that A either is a division algebra over F or is
isomorphic to $M_3(F)$, and that $A \cong M_3(F)$ if and only if there is a 3-dimensional
vector subspace of A that is a left A submodule of A. (Educational note: This
problem makes crucial use of the fact that the size 3 is a prime.)

19. (a) Prove that if $r = N_{K/F}(d)$ for some $d \in K$, then the 3-dimensional vector
subspace $K(1 + d^{-1}j + d^{-1}\sigma(d)^{-1}j^2)$ of A is a left A submodule.

(b) Prove that any 3-dimensional left K submodule of A is necessarily of the
form $K(a_0 + b_0 j + c_0 j^2)$ for some nonzero $a_0 + b_0 j + c_0 j^2$ in A and that
this left K submodule is a left A submodule only if there exists an element
$d \in K$ with $N_{K/F}(d) = r$, $d a_0 = r\sigma(c_0)$, $d b_0 = \sigma(a_0)$, and $d c_0 = \sigma(b_0)$.


CHAPTER III 


Brauer Group 


Abstract. This chapter continues the study of finite-dimensional associative division algebras over
a field F, with particular attention to those that are simple and have center F. Section 5 is a
self-contained digression on cohomology of groups that is preparation for an application in Section 6
and for a general treatment of homological algebra in Chapter IV.


Section 1 introduces the Brauer group of F and the relative Brauer group of K/F, K being
any finite extension field. The Brauer group B(F) is the abelian group of equivalence classes of
finite-dimensional central simple algebras over F under a relation called Brauer equivalence. The
inclusion $F \subseteq K$ induces a group homomorphism $B(F) \to B(K)$, and the relative Brauer group
B(K/F) is the kernel of this homomorphism. The members of the kernel are those classes such
that the tensor product with K of any member of the class is isomorphic to some full matrix algebra
$M_n(K)$; such a class always has a representative A with $\dim_F A = (\dim_F K)^2$. One proves that
B(F) is the union of all B(K/F) as K ranges over all finite Galois extensions of F.

Sections 2-3 establish a group isomorphism $B(K/F) \cong H^2(\mathrm{Gal}(K/F), K^\times)$ when K is a finite
Galois extension of F. With these hypotheses on K and F, Section 2 introduces data called a
factor set for each member of B(K/F). The data depend on some choices, and the effect of making
different choices is to multiply the factor set by a "trivial factor set." Passage to factor sets thereby
yields a function from B(K/F) to the cohomology group $H^2(\mathrm{Gal}(K/F), K^\times)$. Section 3 shows
how to construct a concrete central simple algebra over F from a factor set, and this construction
is used to show that the function from B(K/F) to $H^2(\mathrm{Gal}(K/F), K^\times)$ constructed in Section 2 is
one-one onto. An additional argument shows that this function in fact is a group isomorphism.

Section 4 proves under the same hypotheses that $H^1(\mathrm{Gal}(K/F), K^\times) = 0$, and a corollary
makes this result concrete when the Galois group is cyclic. This result and the corollary are known
as Hilbert's Theorem 90.

Section 5 is a self-contained digression on the cohomology of groups. If G is a group and $\mathbb{Z}G$ is
its integral group ring, a standard resolution of $\mathbb{Z}$ by free $\mathbb{Z}G$ modules is constructed in the category
of all unital left $\mathbb{Z}G$ modules. This has the property that if M is an abelian group on which G acts
by automorphisms, then the groups $H^n(G, M)$ result from applying the functor $\mathrm{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$ to
the members of this resolution, dropping the term $\mathrm{Hom}_{\mathbb{Z}G}(\mathbb{Z}, M)$, and taking the cohomology of
the resulting complex. Section 5 goes on to show that the groups $H^n(G, M)$ arise whenever this
construction is applied to any free resolution of $\mathbb{Z}$, not necessarily the standard one. This section
serves as a prerequisite for Section 6 and as motivational background for Chapter IV.

Section 6 applies the result of Section 5 in the case that G is finite cyclic, producing a nonstandard
free resolution of $\mathbb{Z}$ in this case. From this alternative free resolution, one obtains a rather explicit
formula for $H^2(G, M)$ whenever G is finite cyclic. Application to the case that G is the Galois group
Gal(K/F) for a finite Galois extension gives the explicit formula $B(K/F) \cong F^\times / N_{K/F}(K^\times)$ for
the relative Brauer group when the Galois group is cyclic.
1. Definition and Examples, Relative Brauer Group


The “Brauer group” of a field allows one to work with the set of all isomorphism 
classes of finite-dimensional central division algebras over the field. The basic 
theory in principle reduces the study of all such division algebras to questions in 
the cohomology theory of groups. The latter theory was introduced in Chapter VII 
of Basic Algebra and will be developed further in the present chapter and the next. 

Let F be a field. Theorem 2.4 shows that every finite-dimensional central
simple algebra A over F is of the form $A \cong M_n(D)$ for some uniquely determined
integer $n \geq 1$ and some finite-dimensional central division algebra D over F that
is uniquely determined up to F isomorphism. We can introduce an equivalence
relation for finite-dimensional central simple algebras over F that exactly mirrors
the relation of F isomorphism of the underlying finite-dimensional central
division algebras. Specifically if $A \cong M_n(D)$ and $A' \cong M_{n'}(D')$ are two such
central simple algebras for the same F such that $D \cong D'$, then we say that A
is Brauer equivalent to A', and we write $A \sim A'$. It is immediate from the
definition that "Brauer equivalent" is an equivalence relation. We shall introduce
an abelian-group structure into the set of Brauer equivalence classes, hence into
the set of isomorphism classes of central finite-dimensional division algebras
over F.


Proposition 10.24 of Basic Algebra gives the definition of the tensor product
of two F algebras¹ over F, and this operation is associative, up to canonical
isomorphism, by Proposition 10.22. It is also commutative, up to canonical
isomorphism. In fact, if A and B are given algebras over F, then the canonical
vector-space isomorphism $\varphi : A \otimes_F B \to B \otimes_F A$ is given by $\varphi(a \otimes b) = b \otimes a$.
If $a_1 \otimes b_1$ and $a_2 \otimes b_2$ are given, then the computation

$$\begin{aligned}
\varphi(a_1 \otimes b_1)\,\varphi(a_2 \otimes b_2) &= (b_1 \otimes a_1)(b_2 \otimes a_2) = b_1 b_2 \otimes a_1 a_2 \\
&= \varphi(a_1 a_2 \otimes b_1 b_2) = \varphi\big((a_1 \otimes b_1)(a_2 \otimes b_2)\big)
\end{aligned}$$

shows that $\varphi$ respects multiplication. Hence tensor product is commutative for
algebras, up to canonical isomorphism.


Lemma 3.1. If F is a field, then

(a) $M_n(R) \cong R \otimes_F M_n(F)$ for any algebra R with identity over F,
(b) $M_m(F) \otimes_F M_n(F) \cong M_{mn}(F)$.


PROOF. For (a), the F bilinear map $(r, [a_{ij}]) \mapsto [ra_{ij}]$ of $R \times M_n(F)$ into
$M_n(R)$ has a unique linear extension $\varphi$ to an F linear map of $R \otimes_F M_n(F)$ into
$M_n(R)$. The map $\varphi$ has

$$\begin{aligned}
\varphi\big((r \otimes [a_{ij}])(r' \otimes [a'_{ij}])\big) &= \varphi(rr' \otimes [a_{ij}][a'_{ij}]) \\
&= rr'[a_{ij}][a'_{ij}] \\
&= r[a_{ij}]\,r'[a'_{ij}] \qquad \text{since each } a_{ij} \text{ is in } F \\
&= \varphi(r \otimes [a_{ij}])\,\varphi(r' \otimes [a'_{ij}]),
\end{aligned}$$

and hence $\varphi$ is an F algebra homomorphism. If $\{r_l\}$ is a vector-space basis of R
over F and if $\{E_{ij}\}$ is the usual basis of $M_n(F)$, then $\varphi(r_l \otimes E_{ij}) = r_l E_{ij}$, and it
follows that $\varphi$ carries a vector-space basis onto a vector-space basis. Hence $\varphi$ is
one-one and onto.

For (b), the result of (a) gives $M_m(F) \otimes_F M_n(F) \cong M_n(M_m(F))$, and the
algebra on the right is isomorphic to the algebra $M_{mn}(F)$ of matrices of size mn
by the multiplication-in-blocks isomorphism.

¹ All algebras in this chapter are understood to be associative.
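
As a numerical sanity check of Lemma 3.1b, the following Python sketch (an illustration that is not part of the text; it assumes NumPy, takes integer matrices as stand-ins for matrices over F, and uses the Kronecker product as one concrete model of the tensor product) checks that the block construction respects multiplication and carries the basis $E_{ij} \otimes E_{kl}$ onto a basis of $M_{mn}(F)$.

```python
import numpy as np

m, n = 2, 3
rng = np.random.default_rng(0)
A, C = rng.integers(-5, 6, size=(2, m, m))   # two elements of M_m(F)
B, D = rng.integers(-5, 6, size=(2, n, n))   # two elements of M_n(F)

# np.kron(A, B) is the mn-by-mn block matrix [a_ij * B]; it models A (x) B.
# Multiplicativity: (A (x) B)(C (x) D) = AC (x) BD.
product_respected = np.array_equal(np.kron(A, B) @ np.kron(C, D),
                                   np.kron(A @ C, B @ D))

def unit(size, i, j):
    """Matrix unit E_ij of the given size."""
    E = np.zeros((size, size), dtype=int)
    E[i, j] = 1
    return E

# The images of the basis vectors E_ij (x) E_kl, flattened into rows.
images = np.array([np.kron(unit(m, i, j), unit(n, k, l)).ravel()
                   for i in range(m) for j in range(m)
                   for k in range(n) for l in range(n)])
rank = np.linalg.matrix_rank(images)   # full rank means the images form a basis
```

A full rank of $(mn)^2$ says that the images are linearly independent, which mirrors the basis-to-basis step in the proof of (a).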


Proposition 3.2. For the field F, the operation of tensor product on finite- 
dimensional central simple algebras over F descends to an operation on the set 
of Brauer equivalence classes of such algebras and makes this set into an abelian 
group. 

PROOF. The tensor product of two finite-dimensional algebras over F is again
a finite-dimensional algebra, and Proposition 2.36 shows that the tensor product
of two central simple algebras is again central simple. Hence tensor product is
well defined as an operation on finite-dimensional central simple algebras over
F. Let us see that tensor product is a Brauer class property. Thus suppose that
$A \sim A'$ and $B \sim B'$, say with $A \cong M_m(D)$, $A' \cong M_{m'}(D)$, $B \cong M_n(E)$, and
$B' \cong M_{n'}(E)$. Since the tensor product of some $M_r(F)$ with an algebra over F,
up to isomorphism, does not depend on the order of the two factors and since
tensor product is associative up to isomorphism, Lemma 3.1 gives

$$\begin{aligned}
A \otimes_F B &\cong M_m(D) \otimes_F M_n(E) \cong D \otimes_F M_m(F) \otimes_F M_n(F) \otimes_F E \\
&\cong D \otimes_F M_{mn}(F) \otimes_F E \cong M_{mn}(F) \otimes_F D \otimes_F E \\
&\cong M_{mn}(D \otimes_F E).
\end{aligned}$$

Similarly $A' \otimes_F B' \cong M_{m'n'}(D \otimes_F E)$. Thus $A \otimes_F B \sim A' \otimes_F B'$.

We have observed that the tensor product operation on algebras over F is
associative and commutative, up to canonical isomorphisms, and hence so is the
product operation on Brauer equivalence classes. The class of the 1-dimensional
algebra F is the identity, and the class of the opposite algebra $A^o$ is an inverse to
the class of A because of the isomorphism $A \otimes_F A^o \cong M_n(F)$ given in Corollary
2.38.
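
To make the inverse concrete, here is a small Python sketch (an illustration, not from the text; it assumes NumPy and takes A to be Hamilton's quaternions $\mathbb{H}$ over $\mathbb{R}$). The map sending $a \otimes b$ to the operator $x \mapsto axb$ is an algebra homomorphism of $\mathbb{H} \otimes_{\mathbb{R}} \mathbb{H}^o$ into $\mathrm{End}_{\mathbb{R}}(\mathbb{H}) \cong M_4(\mathbb{R})$, and the code checks numerically that the 16 operators coming from basis elements are linearly independent, so that the map is onto.

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions written as (w, x, y, z) = w + xi + yj + zk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return np.array([
        a1*a2 - b1*b2 - c1*c2 - d1*d2,
        a1*b2 + b1*a2 + c1*d2 - d1*c2,
        a1*c2 - b1*d2 + c1*a2 + d1*b2,
        a1*d2 + b1*c2 - c1*b2 + d1*a2,
    ])

basis = np.eye(4)   # rows are 1, i, j, k

def two_sided(a, b):
    """4-by-4 real matrix of the operator x -> a x b in the basis 1, i, j, k."""
    return np.column_stack([qmul(qmul(a, e), b) for e in basis])

# One operator for each basis element a (x) b of H (x) H^o, flattened into a row.
operators = np.array([two_sided(a, b).ravel() for a in basis for b in basis])
rank = np.linalg.matrix_rank(operators)   # 16 means the operators span M_4(R)
```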
The abelian group of Brauer equivalence classes of finite-dimensional central
simple algebras over F is called the Brauer group of F and is denoted by B(F).
We use additive notation for its product operation.


EXAMPLES ALREADY SETTLED IN CHAPTER II.
(1) If F is algebraically closed, then B(F) = 0.
(2) If $F = \mathbb{R}$, then $B(F) \cong \mathbb{Z}/2\mathbb{Z}$ by Frobenius's Theorem (Theorem 2.50).
(3) If F is a finite field, then B(F) = 0 by Wedderburn's Theorem about finite
division rings (Theorem 2.48).


The group structure for B(F) given in Proposition 3.2 offers little help by
itself in identifying the finite-dimensional division algebras over a particular field.
Instead, the usual procedure for understanding B(F) is to isolate certain special
subgroups of B(F), known as "relative Brauer groups" and denoted by B(K/F),
K being any finite extension of F. Under the assumption that K is a finite Galois
extension of F, Theorem 3.14 below says that B(K/F) is isomorphic to the
cohomology group $H^2(G, N)$, where G is the finite group G = Gal(K/F) and
N is the (abelian) multiplicative group $K^\times$ of the field K. This cohomology group
is in principle manageable. Corollary 3.9 below says that B(F) is the union over
all finite Galois extensions K/F of B(K/F), and we therefore obtain a handle
on B(F).

If A is any finite-dimensional central simple algebra over F and if K/F is any
field extension, then Proposition 2.36a shows that $A \otimes_F K$ is simple as a ring, and
Lemma 2.35b shows that $A \otimes_F K$ has center K. Therefore $A \otimes_F K$ is a central
simple algebra over K, and its Brauer equivalence class is a member of B(K).

Let us see that this map of algebras A into B(K) depends only on the Brauer
equivalence class of A in B(F). Thus suppose that $A \cong M_m(D)$ and $A' \cong M_{m'}(D)$
for some finite-dimensional central division algebra D over F. Lemma 3.1a gives
us isomorphisms of F algebras

$$\begin{aligned}
A \otimes_F K &\cong M_m(D) \otimes_F K \cong (M_m(F) \otimes_F D) \otimes_F K \\
&\cong M_m(F) \otimes_F (D \otimes_F K) \cong M_m(D \otimes_F K),
\end{aligned}$$

and similarly $A' \otimes_F K \cong M_{m'}(D \otimes_F K)$. In each case the left member of
the isomorphism is a K algebra, with K contained in the center. Thus we can
view each of our isomorphisms as isomorphisms of central simple K algebras.
Since $D \otimes_F K$ is a finite-dimensional central simple K algebra, we know that
$D \otimes_F K \cong M_r(E)$ for some finite-dimensional central division algebra E over
K. Application of Lemma 3.1b allows us to continue the displayed isomorphisms
as

$$A \otimes_F K \cong M_m(D \otimes_F K) \cong M_m(M_r(E)) \cong M_{mr}(E).$$

Similarly we have $A' \otimes_F K \cong M_{m'r}(E)$. Thus $A \otimes_F K$ and $A' \otimes_F K$ yield the
same member of B(K), and $(\,\cdot\,) \otimes_F K$ induces a well-defined function from B(F)
into B(K).

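The function from B(F) into B(K) is a group homomorphism. In fact, if A and
B are finite-dimensional central simple algebras over F, then we have K isomorphisms

$$\begin{aligned}
(A \otimes_F K) \otimes_K (B \otimes_F K) &\cong A \otimes_F \big(K \otimes_K (B \otimes_F K)\big) \\
&\cong A \otimes_F (B \otimes_F K) \cong (A \otimes_F B) \otimes_F K,
\end{aligned}$$

and the map is indeed a group homomorphism.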

In addition, the resulting homomorphism satisfies the expected compatibility
condition with respect to compositions. In more detail, if we have nested fields
$F \subseteq K \subseteq L$, then the L isomorphisms

$$(A \otimes_F K) \otimes_K L \cong A \otimes_F (K \otimes_K L) \cong A \otimes_F L$$

show that the composition of tensoring with K over F, followed by tensoring
with L over K, yields the same result as tensoring directly with L over F.

We define the relative Brauer group B(K/F) to be the kernel of the
homomorphism of B(F) into B(K). The members of the group B(K/F) are the Brauer
equivalence classes of finite-dimensional central simple F algebras A such that
$A \otimes_F K$ is F isomorphic to $M_n(K)$ for some n. We say that such algebras are
split over K, that K splits such algebras, and that K is a splitting field for these
algebras and their Brauer equivalence classes.


Theorem 3.3. Let K/F be a finite extension of fields. Then K is a splitting
field for a given member X of B(F) if and only if there exists an algebra A
over F in the Brauer equivalence class X containing a subfield K' isomorphic to
K such that $\dim_F A = (\dim_F K')^2$.


REMARKS.

(1) The theory of the Brauer group makes repeated use of this result. Corollary
2.47 shows that the subfield K' of A is a maximal commutative subalgebra of A
and in particular is a maximal subfield of A.

(2) Observe that the field K is given in the theorem, and hence the integer
$n = \dim_F K$ is known. Then A must have dimension $n^2$. The equality $\dim_F A = n^2$
determines A up to F isomorphism. In fact, Theorem 2.4 shows that $A \cong M_r(D)$
for a central division algebra D whose isomorphism class is determined by the class
X. Then $n^2 = \dim_F A = r^2 \dim_F D$, and $r^2 = n^2/\dim_F D$. So A is indeed
determined up to F isomorphism.

(3) In view of the previous remark, any class X in B(K/F) has a distinguished
representative that is unique up to F isomorphism; the distinguished
representatives of the members of B(K/F) for fixed K all have the same dimension.
PROOF. Suppose that A is a central simple algebra in the Brauer equivalence
class X containing a subfield K' isomorphic to K such that $\dim_F A = (\dim_F K')^2$.
We are to prove that K' splits A. Write n for $\dim_F K'$, so that $\dim_F A = n^2$.
Regard A as an n-dimensional K' vector space with K' acting by right
multiplication on A. Define an F bilinear mapping $f : A \times K' \to \mathrm{End}_{K'}(A)$ by
$f(a, c)(a') = aa'c$; the image $f(a, c)$ is in $\mathrm{End}_{K'}(A)$ because

$$f(a, c)(a'c') = aa'c'c = (aa'c)c' = \big(f(a, c)(a')\big)c'.$$

Extend f without changing its name to an F linear mapping $f : A \otimes_F K' \to
\mathrm{End}_{K'}(A)$ such that $f(a \otimes c)(a') = aa'c$. The mapping f is actually K' linear
because

$$f\big((a \otimes c)c'\big)(a') = f(a \otimes cc')(a') = aa'cc' = \big(f(a \otimes c)(a')\big)c'.$$

Also, it respects multiplication, since

$$\begin{aligned}
f(a \otimes c)\big(f(a' \otimes c')(a'')\big) &= f(a \otimes c)(a'a''c') = aa'a''c'c = aa'a''cc' \\
&= f(aa' \otimes cc')(a'') = f\big((a \otimes c)(a' \otimes c')\big)(a'').
\end{aligned}$$

Thus f is a homomorphism of K' algebras. The domain $A \otimes_F K'$ is central
simple over K', as we saw when setting up the homomorphism $B(F) \to B(K)$,
and therefore f is one-one. Since $A \otimes_F K'$ and $\mathrm{End}_{K'}(A)$ both have K' dimension
$n^2$, f has to be onto. Thus f exhibits $A \otimes_F K'$ as isomorphic to a full matrix
ring over K', and K' splits A.

Conversely suppose that K is a splitting field for the members of the class X
in B(F). Let D be a division algebra in the class X. Since B(K/F) is a group
and therefore contains the inverse class, namely that of $D^o$, we must have $D^o \otimes_F K \cong M_m(K)$
for the integer m such that $\dim_F D^o = m^2$. Let us rewrite this K isomorphism as
$D^o \otimes_F K \cong \mathrm{End}_K(K^m)$. The algebra $\mathrm{End}_F(K^m)$ is central simple over F, and
up to an isomorphism, it contains the K algebra $D^o \otimes_F K$ and hence also the F
algebra $D^o \otimes_F F \cong D^o$. Let A be the centralizer of $D^o$ in $\mathrm{End}_F(K^m)$. We shall
prove that A has the required properties.

The algebra A contains $(\text{center } D^o) \otimes_F K$, which is a subfield K' isomorphic
to K because $D^o$ is central over F, and A is simple by the Double Centralizer
Theorem (Theorem 2.43). The center of A matches the center of the centralizer
of A, which is the center of $D^o$ by Theorem 2.43, which in turn is F. Thus A is
central simple over F. Yet another application of Theorem 2.43 gives

$$(\dim_F A)(\dim_F D^o) = \dim_F \mathrm{End}_F(K^m) = m^2(\dim_F K)^2. \qquad (*)$$

Since $\dim_F D^o = m^2$, we see that $\dim_F A = (\dim_F K)^2$. Thus the subfield K'
of A isomorphic to K has the required dimension.
To see that A is in the Brauer equivalence class X, start from the F bilinear
map $A \times (D^o \otimes_F F) \to \mathrm{End}_F(K^m)$ given by $(a, d \otimes 1) \mapsto ad$, and form its
F linear extension $\varphi : A \otimes_F (D^o \otimes_F F) \to \mathrm{End}_F(K^m)$. The map $\varphi$ respects
multiplication because the members of A commute with the members of $D^o \otimes_F F$:

$$\begin{aligned}
\varphi(a \otimes (d \otimes 1))\big(\varphi(a' \otimes (d' \otimes 1))(v)\big) &= \varphi(a \otimes (d \otimes 1))(a'd'v) = ada'd'v \\
&= aa'dd'v = \varphi(aa' \otimes (dd' \otimes 1))(v).
\end{aligned}$$

Since $A \otimes_F (D^o \otimes_F F)$ is simple by Proposition 2.36, $\varphi$ is one-one. A look at
(*) shows that

$$\dim_F\big(A \otimes_F (D^o \otimes_F F)\big) = (\dim_F A)(\dim_F D^o) = \dim_F \mathrm{End}_F(K^m)$$

and allows us to conclude that $\varphi$ is onto. Therefore $A \otimes_F D^o \cong \mathrm{End}_F(K^m)$.
Since $\mathrm{End}_F(K^m)$ is Brauer equivalent to F, the Brauer equivalence class of A is
the inverse of the class of $D^o$. Hence the class of A equals the class of D, which
is X.


Corollary 3.4. If D is a finite-dimensional central division algebra over the 
field F, then any maximal subfield K of D splits D. 


PROOF. This is the special case of Theorem 3.3 in which A = D. The formula 
for the dimensions holds by Corollary 2.47. 
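
For a concrete instance of Corollary 3.4, take D to be the quaternions $\mathbb{H}$ over $F = \mathbb{R}$ with maximal subfield $K = \mathbb{C}$, so that $\dim_F D = 4 = (\dim_F K)^2$. The Python sketch below (an illustration under these assumptions, not part of the text; it uses NumPy and one standard choice of 2-by-2 complex matrices for i, j, k) checks the quaternion relations and checks that 1, i, j, k are linearly independent over $\mathbb{C}$, which is what is needed for $\mathbb{H} \otimes_{\mathbb{R}} \mathbb{C} \to M_2(\mathbb{C})$ to be an isomorphism.

```python
import numpy as np

one = np.eye(2, dtype=complex)
qi = np.array([[1j, 0], [0, -1j]])            # image of i
qj = np.array([[0, 1], [-1, 0]], dtype=complex)  # image of j
qk = qi @ qj                                   # image of k = ij

# Quaternion relations i^2 = j^2 = k^2 = -1 and ji = -ij.
relations_hold = (np.allclose(qi @ qi, -one) and np.allclose(qj @ qj, -one)
                  and np.allclose(qk @ qk, -one) and np.allclose(qj @ qi, -qk))

# Complex rank of the four images: 4 means they span M_2(C) over C, so the
# map q (x) z -> z * (image of q) from H (x)_R C to M_2(C) is onto.
complex_rank = np.linalg.matrix_rank(np.array([m.ravel() for m in (one, qi, qj, qk)]))
```

Since both sides have dimension 4 over $\mathbb{C}$, surjectivity gives $\mathbb{H} \otimes_{\mathbb{R}} \mathbb{C} \cong M_2(\mathbb{C})$, in line with the corollary.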


Corollary 3.5. If F is a field, then the Brauer group B(F) is the union of all
relative Brauer groups B(K/F) as K ranges over all finite extensions of F.


REMARKS. This result is all very tidy but is not very useful, since we have no
indication how to identify B(K/F) for a general finite extension K. In Corollary
3.9 below, we sharpen this result to make K range only over the finite Galois
extensions of F, and we shall see in Section 3 that B(K/F) can be realized for
such fields K in terms of the cohomology of groups.


PROOF. Any member of B(F) has some central division algebra D as a
representative, and Corollary 3.4 identifies an extension field K of F that splits
D, namely any maximal subfield of D.


Corollary 3.6. Let D be a finite-dimensional central division algebra over a
field F, and let $\dim_F D = n^2$. If K is a splitting field for D, then $\dim_F K$ is a
multiple of n.


PROOF. If K is a splitting field for D, then Theorem 3.3 says that there
exists an integer r such that $M_r(D)$ contains a subfield K' isomorphic to K with
$\dim_F M_r(D) = (\dim_F K')^2$. Thus $r^2 n^2 = (\dim_F K)^2$, and $rn = \dim_F K$.
Theorem 3.7 (Noether-Jacobson Theorem). If D is a noncommutative
finite-dimensional central division algebra over the field F, then there exists a member
of D that is not in F and is separable over F.


REMARKS. Within a field extension K/F, we know from Corollary 9.31 of
Basic Algebra that the subset of all elements of K that are separable over F is
a subfield of K containing F. Consequently an equivalent formulation of the
theorem is that D contains a nontrivial separable extension field of F.


PROOF (Herstein). Arguing by contradiction, suppose that no element of D
outside F is separable over F. Let the characteristic of F be p, necessarily
nonzero. If a is any element of D not in F, then the assumed nonseparability
implies that the minimal polynomial $f(X)$ of a over F has $f'(X) = 0$, according
to Proposition 9.27 of Basic Algebra. Hence $f(X) = f_1(X^p)$ for some polynomial
$f_1(X)$ in $F[X]$. In turn, the minimal polynomial of $a^p$ is $f_1(X)$, and if $a^p$ is
not in F, then $f_1(X) = f_2(X^p)$ for some polynomial $f_2(X)$ in $F[X]$. Since the
degree decreases at each step as we pass from $f$ to $f_1$, from $f_1$ to $f_2$, and so on,
we conclude that $a^{p^e}$ is in F for some e. In short, each a in D has the property
that there is some integer $e \geq 0$ depending on a such that $a^{p^e}$ is in F.

In view of the assumption that $D \neq F$ and the argument that we have just
seen, there exists an element a in D outside F such that $a^p$ is in F. Define a
function $d : D \to D$ by $d(x) = xa - ax$. The function d is F linear, and it is
not identically 0 because a is not in the center F of D. If r and l denote right
and left multiplication, we can rewrite d as $d(x) = (r(a) - l(a))(x)$. The linear
maps $r(a)$ and $l(a)$ commute with each other, and thus the Binomial Theorem is
applicable in computing $d^p(x)$ as

$$d^p(x) = (r(a) - l(a))^p(x) = (r(a)^p - l(a)^p)(x) = xa^p - a^p x = 0,$$

the last equality holding because $a^p$ is in F and is therefore central. Since $d^p$ is the
zero function and d is not, there exist an integer s with $2 \leq s \leq p$ and an element
y in D with $d^{s-1}y \neq 0$ and $d^s y = 0$. Put $x = d^{s-1}y$. Since $x = d(d^{s-2}y)$, the
element $w = d^{s-2}y$ has the property that $x = wa - aw$. The condition $dx = 0$
says that $xa = ax$. Put $x = au$. The elements a and u commute because a and
x commute. If we set $c = wu^{-1}$, then $x = wa - aw = cua - acu$, and hence
$a = xu^{-1} = cuau^{-1} - ac$. Since a and u commute, we obtain $a = ca - ac$. Right
multiplying by $a^{-1}$ gives $1 = c - aca^{-1}$ and therefore $c = 1 + aca^{-1}$. Raising
both sides to the $p^e$ power gives $c^{p^e} = 1 + ac^{p^e}a^{-1}$. The first paragraph of the
proof shows that there is some $e \geq 0$ for which $c^{p^e}$ is in F, and for this integer
e, we obtain the contradictory equation $c^{p^e} = 1 + c^{p^e}$ from the commutativity
of a with F. This completes the proof.
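
The step $(r(a) - l(a))^p = r(a)^p - l(a)^p$ rests on the fact that the binomial coefficients $\binom{p}{k}$ with $0 < k < p$ vanish in characteristic p. The Python sketch below (an illustration only, not part of the text; it assumes NumPy and replaces D by the matrix ring $M_3(\mathbb{F}_5)$, which is not a division algebra but has the same commuting left and right multiplications) checks both the divisibility and the resulting formula $d^p(x) = xa^p - a^p x$.

```python
from math import comb
import numpy as np

p = 5
# Every binomial coefficient C(p, k) with 0 < k < p is divisible by p.
middle_binomials_vanish = all(comb(p, k) % p == 0 for k in range(1, p))

rng = np.random.default_rng(1)
a = rng.integers(0, p, size=(3, 3))   # an element of M_3(F_p)
x = rng.integers(0, p, size=(3, 3))

def d(y):
    """The map y -> y a - a y, with entries reduced mod p."""
    return (y @ a - a @ y) % p

# Apply d exactly p times to x.
z = x
for _ in range(p):
    z = d(z)

a_p = np.linalg.matrix_power(a, p) % p
iterated_matches = np.array_equal(z, (x @ a_p - a_p @ x) % p)
```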
Corollary 3.8. If D is a noncommutative finite-dimensional central division
algebra over the field F and if K is a subfield of D that is separable over F, then
there exists a maximal subfield L of D containing K such that L is separable
over F.


PROOF. Because of the finite dimensionality, we may assume without loss
of generality that K is not properly contained in any larger subfield of D that
is separable over F. Arguing by contradiction, we may assume that K is not a
maximal subfield of D. Let E be the centralizer of K in D. This is a division
algebra over F. It is simple by the Double Centralizer Theorem (Theorem 2.43),
and it contains K because K is commutative. Moreover, we know from Theorem
2.43 that

$$\dim_F D = (\dim_F K)(\dim_F E)$$

and that K is the centralizer of E. The latter condition shows that the division
algebra E is central simple over K. Since K is not a maximal subfield of D,
Corollary 2.46 gives $\dim_F D > (\dim_F K)^2$. Thus $\dim_F K < \dim_F E$. Since E
is central over K, E is noncommutative.

Application of Theorem 3.7 produces an element x in E outside K that is
separable over K. Let L be the subfield K(x) of E. Since K is a separable
extension of F, the Theorem of the Primitive Element gives an element a of K
such that K = F(a). Then L = F(a, x). The implication (b) implies (c) in
Corollary 9.29 of Basic Algebra shows that if a is separable over F and x is
separable over F(a), then a and x are both separable over F. The elements of L
that are separable over F form a subfield of L, and we have just proved that this
subfield properly contains K. This conclusion contradicts the assumption that K
is a maximal separable extension of F within D, and the proof is complete.


Corollary 3.9. If F is a field, then the Brauer group B(F) is the union of
all relative Brauer groups B(K/F) as K ranges over all finite Galois extensions
of F.


REMARKS. This is the result of interest. Each B(K/F) with K as in the
corollary will be seen to be given as an $H^2$ in the cohomology of groups, and this
group is in principle manageable. Thus we obtain a handle on B(F).


PROOF. If D is a central division algebra over F, then Corollaries 3.4 and 3.8
together show that some finite separable extension K' of F splits D. That is, the
Brauer equivalence class of D lies in B(K'/F). Let us write K' = F(a) by the
Theorem of the Primitive Element. If $f(X)$ is the minimal polynomial of a over F,
then every root of $f(X)$ in an algebraic closure $\overline{F}$ of F containing K' is separable
over F. Let K be the subfield of $\overline{F}$ generated by all the roots. This is a finite
normal extension, and Corollary 9.30 of Basic Algebra shows that it is a separable
extension. We have seen that the composition of the homomorphisms $B(F) \to
B(K')$ and $B(K') \to B(K)$ is $B(F) \to B(K)$, and consequently $B(K'/F) \subseteq
B(K/F)$. Therefore the Brauer equivalence class of D lies in B(K/F).


2. Factor Sets 


Throughout this section let K/F be a finite Galois extension of fields. Our
objective is to construct a function from the relative Brauer group B(K/F) into
the cohomology group $H^2(\mathrm{Gal}(K/F), K^\times)$. In Section 3 we shall prove that this
function is a group isomorphism.

We take as known the material in Chapter VII of Basic Algebra on cohomology
of groups. For convenient reference we list the relevant formulas for cohomology
in degree 2. If G is a group and N is an abelian group on which G acts by
automorphisms, the group $C^2(G, N)$ of 2-cochains is the group of all functions
$f : G \times G \to N$, the group $Z^2(G, N)$ of 2-cocycles is the set of members f of
$C^2(G, N)$ such that

$$u(f(v, w)) + f(u, vw) = f(uv, w) + f(u, v) \qquad \text{for all } u, v, w \in G,$$

the group $B^2(G, N)$ of 2-coboundaries is the set of members f of $C^2(G, N)$ of
the form

$$f(u, v) = u(\alpha(v)) - \alpha(uv) + \alpha(u) \qquad \text{for some } \alpha : G \to N,$$

and the cohomology group $H^2(G, N)$ is the quotient

$$H^2(G, N) = Z^2(G, N)/B^2(G, N).$$

Here it is understood that we are using additive notation for the group operation
in N and that the action of $u \in G$ on a member n of N is denoted by $u(n)$.
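
The formulas can be checked by brute force in a small case. The Python sketch below is an illustration that is not part of the text: it assumes the hypothetical example $G = \mathbb{Z}/3$ acting on $N = \mathbb{Z}/7$ with the generator acting as multiplication by 2 (an automorphism of order 3, since $2^3 \equiv 1 \bmod 7$), and it verifies that every 2-coboundary satisfies the 2-cocycle identity, so that $B^2(G, N) \subseteq Z^2(G, N)$ in this case.

```python
from itertools import product

G = range(3)      # the group Z/3, written additively
MOD = 7           # N = Z/7, written additively

def act(u, n):
    """Action of u in G on n in N: multiplication by 2^u mod 7."""
    return (pow(2, u, MOD) * n) % MOD

def coboundary(alpha):
    """f(u, v) = u(alpha(v)) - alpha(uv) + alpha(u)."""
    return {(u, v): (act(u, alpha[v]) - alpha[(u + v) % 3] + alpha[u]) % MOD
            for u, v in product(G, G)}

def is_cocycle(f):
    """Check u(f(v, w)) + f(u, vw) = f(uv, w) + f(u, v) for all u, v, w."""
    return all(
        (act(u, f[v, w]) + f[u, (v + w) % 3]) % MOD
        == (f[(u + v) % 3, w] + f[u, v]) % MOD
        for u, v, w in product(G, G, G)
    )

# Run through all 7^3 functions alpha : G -> N.
all_coboundaries_are_cocycles = all(
    is_cocycle(coboundary(dict(enumerate(values))))
    for values in product(range(MOD), repeat=3)
)
```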

In constructing the function from B(K/F) into $H^2(\mathrm{Gal}(K/F), K^\times)$, we shall
proceed in somewhat the same fashion as for the identification of group extensions
with an $H^2$ that was carried out in Chapter VII of Basic Algebra. Namely we shall
associate a "factor set" to some choices concerning a given finite-dimensional
central simple algebra and see that this factor set is a cocycle. Then we shall show
that the factor set for any set of choices for any Brauer-equivalent central simple
algebra differs from this cocycle by a coboundary. The result will be the desired
function from B(K/F) into $H^2(\mathrm{Gal}(K/F), K^\times)$.

Thus write G for Gal(K/F), fix a Brauer equivalence class X in B(K/F), and
let A be a central simple algebra in the class X meeting the conditions of Theorem
3.3: A contains a subfield K' isomorphic to K, and $\dim_F A = (\dim_F K')^2$. Write
$c \mapsto c'$ for the isomorphism $K \to K'$.
Let $\sigma$ be an element of the Galois group G. Then $c \mapsto c'$ and $c \mapsto \sigma(c)'$
are two algebra homomorphisms of the simple algebra K into the central simple
algebra A, and the Skolem-Noether Theorem (Theorem 2.41) says that they are
related by an inner automorphism:

$$\boxed{\sigma(c)' = x_\sigma c' x_\sigma^{-1}} \qquad \text{for some } x_\sigma \in A^\times.$$

Some choice is involved in selecting $x_\sigma$, but the element $x_\sigma$ is unique up to a
factor from K' on the right. In fact, if $x_\sigma$ and $y_\sigma$ both behave as in the boxed
formula, then $y_\sigma^{-1}x_\sigma$ commutes with K' and hence is in K'. Thus $x_\sigma = y_\sigma c_0'$ with
$c_0'$ in K'.

The nonuniqueness can be expressed also in terms of a factor from K' on the
left. In fact, the boxed formula for $c = c_0$ implies that $x_\sigma = (x_\sigma c_0' x_\sigma^{-1})(x_\sigma (c_0')^{-1}) =
\sigma(c_0)' y_\sigma$.

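At any rate, fix a choice of $x_\sigma$ for all $\sigma \in G$, and let us examine the effect of
composition. If $\sigma$ and $\tau$ are in G, then

$$x_{\sigma\tau} c' x_{\sigma\tau}^{-1} = (\sigma\tau)(c)' = \sigma(\tau(c))' = x_\sigma \tau(c)' x_\sigma^{-1} = x_\sigma x_\tau c' x_\tau^{-1} x_\sigma^{-1}.$$

Using the result of the previous paragraph, we see that $x_{\sigma\tau}$ and $x_\sigma x_\tau$ are related
by a factor from K' on the left. Hence we can write

$$x_\sigma x_\tau = a(\sigma, \tau)' x_{\sigma\tau} \qquad \text{with } a(\sigma, \tau) \in K^\times.$$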


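If we examine the effect of composing three elements of G, we obtain a
consistency condition that the function $a : G \times G \to K^\times$ must satisfy. Namely,
let $\rho$, $\sigma$, and $\tau$ be in G, and let us compute $x_\rho x_\sigma x_\tau$ in two ways, taking advantage
of the associativity in A. With one grouping, we obtain

$$x_\rho x_\sigma x_\tau = (x_\rho x_\sigma) x_\tau = a(\rho, \sigma)' x_{\rho\sigma} x_\tau = a(\rho, \sigma)' a(\rho\sigma, \tau)' x_{\rho\sigma\tau},$$

and with the other grouping, we have

$$\begin{aligned}
x_\rho x_\sigma x_\tau = x_\rho (x_\sigma x_\tau) &= x_\rho\, a(\sigma, \tau)' x_{\sigma\tau} \\
&= \rho(a(\sigma, \tau))' x_\rho x_{\sigma\tau} = \rho(a(\sigma, \tau))' a(\rho, \sigma\tau)' x_{\rho\sigma\tau}.
\end{aligned}$$

Therefore the function $a : G \times G \to K^\times$ satisfies

$$\boxed{\rho(a(\sigma, \tau))\, a(\rho, \sigma\tau) = a(\rho, \sigma)\, a(\rho\sigma, \tau).}$$

A function $a : G \times G \to K^\times$ satisfying the above boxed formula is called a
factor set. From A, an isomorphism $K \to K'$, and a choice of the elements $x_\sigma$
for $\sigma \in G$, we have obtained a factor set.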
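
For a concrete factor set, the following Python sketch (an illustration that is not part of the text) takes a cyclic Galois group of order n with generator $\sigma$ and an element $r \in F^\times$, and uses the function $a(\sigma^i, \sigma^j) = 1$ if $i + j < n$ and $= r$ otherwise, for $0 \leq i, j < n$. The choices n = 3 and r = 2 are hypothetical. Because the values lie in F, the Galois action on them is trivial, and the boxed condition reduces to an identity that can be checked exhaustively.

```python
from fractions import Fraction
from itertools import product

n = 3              # order of the cyclic group generated by sigma (assumed)
r = Fraction(2)    # a nonzero element of F (assumed value)

def cyclic_factor_set(i, j):
    """a(sigma^i, sigma^j) for 0 <= i, j < n: 1 if i + j < n, and r otherwise."""
    return Fraction(1) if i + j < n else r

def is_factor_set(a, order):
    # The values of a lie in F, which the Galois group fixes, so
    # rho(a(sigma, tau)) = a(sigma, tau) in the boxed condition.
    return all(
        a(j, k) * a(i, (j + k) % order) == a(i, j) * a((i + j) % order, k)
        for i, j, k in product(range(order), repeat=3)
    )

cyclic_is_factor_set = is_factor_set(cyclic_factor_set, n)
```

Both sides of the condition equal r raised to the integer part of $(i + j + k)/n$, which is why the check succeeds for every n and r.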
Comparing this boxed formula with the formulas in the second paragraph of
this section, we see that a factor set is exactly a member of $Z^2(\mathrm{Gal}(K/F), K^\times)$
except that the boxed formula uses multiplicative notation for $K^\times$ and the
definition of 2-cocycle uses additive notation. Thus we have associated a member of
$Z^2(\mathrm{Gal}(K/F), K^\times)$ to the triple consisting of A, an isomorphism $K \to K'$, and
a choice of the elements $x_\sigma$ for $\sigma \in G$.

With the extension K/F and the class $X \in B(K/F)$ fixed, let us see the effect
on the factor set of making different choices. The algebra A lies in the Brauer
equivalence class X and has $\dim_F A = (\dim_F K)^2$. As we saw in the remarks
with Theorem 3.3, A is determined up to isomorphism by these properties.

Thus let us start from a different system of choices: an algebra B in the
class X, an isomorphism $K \to K''$, and elements $y_\sigma$ for $\sigma \in G$ such that
$\sigma(c)'' = y_\sigma c'' y_\sigma^{-1}$. Define the corresponding factor set $b : G \times G \to K^\times$ by

$$y_\sigma y_\tau = b(\sigma, \tau)'' y_{\sigma\tau}.$$

We wish to relate $a(\sigma, \tau)$ and $b(\sigma, \tau)$. We have just seen that A and B are
isomorphic as algebras. Let $\varphi : A \to B$ be an isomorphism. Then $c \mapsto \varphi(c')$
and $c \mapsto c''$ are two algebra homomorphisms of K into B, and the Skolem-Noether
Theorem (Theorem 2.41) produces an element $t \in B$ with

$$c'' = t\varphi(c')t^{-1} \qquad \text{for all } c \in K.$$


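Starting from the formula $\sigma(c)' = x_\sigma c' x_\sigma^{-1}$, apply $\varphi$ and conjugate by t to obtain

$$\sigma(c)'' = t\varphi(\sigma(c)')t^{-1} = \big(t\varphi(x_\sigma)t^{-1}\big)\, c''\, \big(t\varphi(x_\sigma)t^{-1}\big)^{-1}.$$

This equation says that $t\varphi(x_\sigma)t^{-1}$ serves the same purpose as $y_\sigma$, and therefore

$$y_\sigma = c_\sigma''\, t\varphi(x_\sigma)t^{-1}$$

for some member $c_\sigma''$ of K'' placed on the left. Substitution into the formula
$y_\sigma y_\tau = b(\sigma, \tau)'' y_{\sigma\tau}$ gives

$$c_\sigma''\, t\varphi(x_\sigma)t^{-1}\, c_\tau''\, t\varphi(x_\tau)t^{-1} = b(\sigma, \tau)''\, c_{\sigma\tau}''\, t\varphi(x_{\sigma\tau})t^{-1}.$$

If we substitute from the formula $c'' = t\varphi(c')t^{-1}$ for all members of K'' and then
conjugate by $t^{-1}$ and apply $\varphi^{-1}$, we obtain

$$c_\sigma' x_\sigma c_\tau' x_\tau = b(\sigma, \tau)' c_{\sigma\tau}' x_{\sigma\tau}.$$

The left side equals

$$c_\sigma' \sigma(c_\tau)' x_\sigma x_\tau = c_\sigma' \sigma(c_\tau)' a(\sigma, \tau)' x_{\sigma\tau},$$

and comparison of this expression with the right side gives

$$b(\sigma, \tau)' c_{\sigma\tau}' = c_\sigma' \sigma(c_\tau)' a(\sigma, \tau)'.$$

Passing from K' back to K, we conclude that

$$b(\sigma, \tau)\, c_{\sigma\tau} = c_\sigma\, \sigma(c_\tau)\, a(\sigma, \tau).$$

This formula says that b is the product of a and the trivial factor set $c : G \times G \to
K^\times$ given by

$$c(\sigma, \tau) = c_\sigma\, \sigma(c_\tau)\, c_{\sigma\tau}^{-1},$$

where $\sigma \mapsto c_\sigma$ is some function from G to $K^\times$. Again referring to the second
paragraph of this section and remembering that we are using multiplicative
notation for $K^\times$, we see that the trivial factor sets are the 2-coboundaries, lying
in $B^2(\mathrm{Gal}(K/F), K^\times)$, in the same way that the general factor sets are the
2-cocycles, lying in $Z^2(\mathrm{Gal}(K/F), K^\times)$. We have thus proved the following
proposition.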
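
As a quick numerical illustration (not part of the text), the Python sketch below takes the hypothetical case $K = \mathbb{C}$, $F = \mathbb{R}$, with G = {0, 1} and 1 acting by complex conjugation, picks an arbitrary function $\sigma \mapsto c_\sigma$ into $\mathbb{C}^\times$, and checks in floating point that the trivial factor set $c(\sigma, \tau) = c_\sigma \sigma(c_\tau) c_{\sigma\tau}^{-1}$ satisfies the factor-set condition.

```python
from itertools import product

G = (0, 1)   # Gal(C/R), written additively; 1 is complex conjugation

def act(s, z):
    """Action of s in G on a complex number z."""
    return z.conjugate() if s == 1 else z

def trivial_factor_set(c):
    """c(s, t) = c_s * s(c_t) / c_{st} for a function s -> c_s into C^x."""
    return {(s, t): c[s] * act(s, c[t]) / c[(s + t) % 2]
            for s, t in product(G, G)}

def cocycle_defect(a):
    """Largest |rho(a(s,t)) a(rho, st) - a(rho, s) a(rho s, t)| over all triples."""
    return max(
        abs(act(rho, a[s, t]) * a[rho, (s + t) % 2]
            - a[rho, s] * a[(rho + s) % 2, t])
        for rho, s, t in product(G, G, G)
    )

c = {0: 2 + 1j, 1: 1 - 3j}                    # arbitrary nonzero values (assumed)
defect = cocycle_defect(trivial_factor_set(c))  # zero up to rounding error
```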


Proposition 3.10. Let K be a finite Galois extension of the field F. For
X in B(K/F), let A be an algebra in the Brauer equivalence class X with
$\dim_F A = (\dim_F K)^2$, let $K \to K'$ be an isomorphism of K into A, and let
$\{x_\sigma \mid \sigma \in \mathrm{Gal}(K/F)\} \subseteq A^\times$ be a set of elements such that $\sigma(c)' = x_\sigma c' x_\sigma^{-1}$.
Then the passage from X to the factor set determined by the triple of data
$(A, K \to K', \{x_\sigma\})$ descends to a well-defined function from the abelian group
B(K/F) to the abelian group $H^2(\mathrm{Gal}(K/F), K^\times)$.


3. Crossed Products 


In this section we continue to assume that K/F is a finite Galois extension of
fields. We are going to show that the function $B(K/F) \to H^2(\mathrm{Gal}(K/F), K^\times)$
given in Proposition 3.10 is an isomorphism of groups. The homomorphism
property comes last and is the hard part of the argument. In the meantime,
we construct the inverse function by associating an algebra to each member of
$Z^2(\mathrm{Gal}(K/F), K^\times)$ and showing in Corollary 3.13 that the resulting function on
$Z^2(\mathrm{Gal}(K/F), K^\times)$ descends to an inverse function from $H^2(\mathrm{Gal}(K/F), K^\times)$
into B(K/F). The algebra is called a "crossed product" and is produced in
Proposition 3.12 below. Before either of these steps, we establish one more
property of the system $\{x_\sigma \mid \sigma \in \mathrm{Gal}(K/F)\}$ of the previous section that has not
needed mentioning until now.
Thus let a central simple algebra A be given with $\dim_F A = (\dim_F K)^2$, along
with an isomorphism $K \to K'$ denoted by $c \mapsto c'$. As in the previous section
we choose $x_\sigma \in A^\times$ with

$$\sigma(c)' = x_\sigma c' x_\sigma^{-1} \qquad \text{for all } c \in K.$$

The corresponding factor set $a(\sigma, \tau)$ has

$$x_\sigma x_\tau = a(\sigma, \tau)' x_{\sigma\tau}.$$

We regard A as a vector space over K' with K' acting by multiplication on the
left.


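Lemma 3.11. With hypotheses as above, the set $\{x_\sigma \mid \sigma \in \mathrm{Gal}(K/F)\}$ is a
vector-space basis of A over K'.

PROOF. Let G = Gal(K/F). Since $|G| = \dim_F K = \dim_F K' = \dim_{K'} A$, it
is enough to prove linear independence. Arguing by contradiction, assume that
the set $\{x_\sigma \mid \sigma \in G\}$ is linearly dependent. Choose a maximal subset J of G
such that $\{x_\tau \mid \tau \in J\}$ is linearly independent. For $\sigma$ not in J, we then have

$$x_\sigma = \sum_{\tau \in J} a_\tau' x_\tau \qquad \text{with } a_\tau \in K. \qquad (*)$$

Every nonzero c in K satisfies

$$\sigma(c)' x_\sigma = x_\sigma c' = \sum_{\tau \in J} a_\tau' x_\tau c' = \sum_{\tau \in J} a_\tau' \tau(c)' x_\tau,$$

and thus $x_\sigma = \sum_{\tau \in J} (\sigma(c)')^{-1} a_\tau' \tau(c)' x_\tau$. Comparing this expansion with (*)
shows that

$$(\sigma(c)')^{-1} a_\tau' \tau(c)' = a_\tau' \qquad \text{for } \tau \in J. \qquad (**)$$

Since $x_\sigma \neq 0$, some $a_\tau'$ in the expansion (*) is nonzero. For this $\tau$, we can cancel
$a_\tau'$ in (**) and obtain $\sigma(c)' = \tau(c)'$ for all $c \in K$. Then $\sigma = \tau$, in contradiction
to the fact that $\sigma$ is not in J.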


The linear independence in Lemma 3.11 allows us to read off the structure of
A: as a K' vector space, the algebra A is given by $A = \bigoplus_{\sigma \in \mathrm{Gal}(K/F)} K' x_\sigma$, and
the elements $x_\sigma$ have the properties that

$$x_\sigma c' = \sigma(c)' x_\sigma \quad \text{for } c \in K \qquad \text{and} \qquad x_\sigma x_\tau = a(\sigma, \tau)' x_{\sigma\tau}.$$

Proposition 3.12 is motivated by these formulas, saying that we can reconstruct
A from a given 2-cocycle $a(\sigma, \tau)$ in such a way that these formulas hold.
Proposition 3.12. Let $K/F$ be a finite Galois extension, and let $a = a(\sigma,\tau)$ be in $Z^2(\mathrm{Gal}(K/F), K^\times)$. Then there exist a central simple algebra $A$ over $F$ with $\dim_F A = (\dim_F K)^2$, an isomorphism $K \to K'$ of $K$ onto a subfield $K'$ of $A$, and a subset $\{x_\sigma \in A \mid \sigma \in \mathrm{Gal}(K/F)\}$ such that

(a) $A = \bigoplus_{\sigma \in \mathrm{Gal}(K/F)} K' x_\sigma$,
(b) $x_\sigma c' x_\sigma^{-1} = \sigma(c)'$ for all $c$ in $K$, with $c \mapsto c'$ denoting the isomorphism of $K$ onto $K'$,
(c) $x_\sigma x_\tau = a(\sigma,\tau)'\,x_{\sigma\tau}$.


REMARKS. We write $A = \mathcal{A}(K, \mathrm{Gal}(K/F), a)$ and call $A$ the crossed-product algebra corresponding to the factor set $a$. The algebra $A$ is completely determined by the given conditions, up to canonical isomorphism, since (a), (b), and (c) determine the entire multiplication table of $A$.


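PROOF. Let $G = \mathrm{Gal}(K/F)$, form a set $\{x_\sigma \mid \sigma \in G\}$, and let $A$ be the $K$ vector space (free $K$ module) with basis $\{x_\sigma\}$. Then $A = \bigoplus_{\sigma \in G} K x_\sigma$. Define a multiplication on $K$ multiples of basis vectors in $A$ by

$$(c x_\sigma)(d x_\tau) = c\,\sigma(d)\,a(\sigma,\tau)\,x_{\sigma\tau}, \qquad (*)$$

and extend it to a multiplication on $A$ by additivity.

First we shall check that $A$ is an associative $F$ algebra with $a(1,1)^{-1}x_1$ as identity by making use of the cocycle property

$$\rho(a(\sigma,\tau))\,a(\rho,\sigma\tau) = a(\rho,\sigma)\,a(\rho\sigma,\tau). \qquad (**)$$

For associativity, $(*)$ gives

$$(b x_\rho)\big((c x_\sigma)(d x_\tau)\big) = (b x_\rho)\big(c\,\sigma(d)\,a(\sigma,\tau)\,x_{\sigma\tau}\big) = b\,\rho(c)\,(\rho\sigma(d))\,\rho(a(\sigma,\tau))\,a(\rho,\sigma\tau)\,x_{\rho\sigma\tau}$$

and

$$\big((b x_\rho)(c x_\sigma)\big)(d x_\tau) = \big(b\,\rho(c)\,a(\rho,\sigma)\,x_{\rho\sigma}\big)(d x_\tau) = b\,\rho(c)\,a(\rho,\sigma)\,(\rho\sigma(d))\,a(\rho\sigma,\tau)\,x_{\rho\sigma\tau},$$

and the right sides are equal by $(**)$. To see that $a(1,1)^{-1}x_1$ is a two-sided identity, take $\rho = \sigma = 1$ in $(**)$ to get $1(a(1,\tau))\,a(1,\tau) = a(1,1)\,a(1,\tau)$. Since $a$ takes values in $K^\times$, we can cancel and obtain

$$a(1,\tau) = a(1,1). \qquad (\dagger)$$

Thus $(*)$ gives

$$(a(1,1)^{-1}x_1)(d x_\tau) = a(1,1)^{-1}\,1(d)\,a(1,\tau)\,x_\tau = d x_\tau.$$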
Similarly another specialization of $(**)$ is $\sigma(a(1,1))\,a(\sigma,1) = a(\sigma,1)\,a(\sigma,1)$, from which we obtain

$$\sigma(a(1,1)) = a(\sigma,1). \qquad (\dagger\dagger)$$

Thus $(*)$ gives

$$(c x_\sigma)(a(1,1)^{-1}x_1) = c\,\sigma(a(1,1))^{-1}\,a(\sigma,1)\,x_\sigma = c x_\sigma,$$

and $a(1,1)^{-1}x_1$ is indeed a two-sided identity. We denote it by 1. Scalar multiplication by $r \in F$ is understood to be the additive extension of $r(c x_\sigma) = (rc)x_\sigma$ for $c \in K$, and the identities

$$(r(c x_\sigma))(d x_\tau) = r c\,\sigma(d)\,a(\sigma,\tau)\,x_{\sigma\tau},$$
$$(c x_\sigma)(r(d x_\tau)) = c\,\sigma(rd)\,a(\sigma,\tau)\,x_{\sigma\tau} = r c\,\sigma(d)\,a(\sigma,\tau)\,x_{\sigma\tau},$$
$$r((c x_\sigma)(d x_\tau)) = r c\,\sigma(d)\,a(\sigma,\tau)\,x_{\sigma\tau}$$

show that multiplication in $A$ is $F$ linear with respect to scalars, hence show that $A$ is an algebra over $F$.

Second we define $K' \subseteq A$ and an isomorphism $K \to K'$. For $b \in K$, we let $b'$ be the member of $A$ given by $b' = b1 = b(a(1,1)^{-1}x_1)$, and we let $K'$ be the image of $K$ under $b \mapsto b'$. The map $b \mapsto b'$ certainly respects addition, and it respects multiplication because the identity

$$(b_1 a(1,1)^{-1}x_1)(b_2 a(1,1)^{-1}x_1) = b_1 b_2\,a(1,1)^{-1}x_1$$

is immediate from $(*)$. Hence $K'$ is a subfield of $A$.
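Third we prove properties (a), (b), and (c). For (a), we use $(*)$ and $(\dagger)$ to obtain the identity

$$b' x_\sigma = (b\,a(1,1)^{-1}x_1)x_\sigma = b\,a(1,1)^{-1}a(1,\sigma)\,x_\sigma = b x_\sigma. \qquad (\ddagger)$$

This identity shows that $K' x_\sigma = K x_\sigma$, and (a) follows. From $(*)$, we see also that $x_\sigma(b x_{\sigma^{-1}}) = (1x_\sigma)(b x_{\sigma^{-1}}) = 1\,\sigma(b)\,a(\sigma,\sigma^{-1})\,x_1$ and that $(b x_{\sigma^{-1}})x_\sigma = b\,\sigma^{-1}(1)\,a(\sigma^{-1},\sigma)\,x_1$; thus $x_\sigma$ has a right inverse in $A$ and also a left inverse, hence a two-sided inverse. Consequently the statement of (b) is meaningful; for its proof we have only to observe that

$$x_\sigma c' x_\sigma^{-1} = \big(x_\sigma(c\,a(1,1)^{-1}x_1)\big)x_\sigma^{-1} = \sigma(c)\,\sigma(a(1,1))^{-1}a(\sigma,1)\,x_\sigma x_\sigma^{-1} = \sigma(c)\,x_\sigma x_\sigma^{-1} = \sigma(c)'\,x_\sigma x_\sigma^{-1} = \sigma(c)',$$

the last three equalities following from $(\dagger\dagger)$, $(\ddagger)$, and the identity $x_\sigma x_\sigma^{-1} = 1$. For (c), we have

$$x_\sigma x_\tau = a(\sigma,\tau)\,x_{\sigma\tau} = a(\sigma,\tau)'\,x_{\sigma\tau},$$

the second equality following from $(\ddagger)$.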

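Fourth we show that $A$ is simple. Let $I$ be a proper two-sided ideal in $A$, and let $\varphi : A \to A/I$ be the quotient homomorphism. Since 1 is not in $I$ and since $K'$ is a subfield of $A$, we know that $\ker(\varphi|_{K'}) = 0$ and that $\varphi(K')$ is a subfield of $A/I$. The field $\varphi(K')$ acts on $A/I$ by left multiplication and makes $A/I$ into a $\varphi(K')$ vector space. The members $\varphi(x_\sigma)$ of $A/I$ certainly span $A/I$ over $\varphi(K')$ because of (a), and the claim is that they are linearly independent. If so, then $\varphi$ is one-one, $I$ equals 0, and $A$ is simple. For the linear independence, we argue by contradiction in the same way as for Lemma 3.11. Suppose that $J \subseteq G$ is a maximal subset such that $\{\varphi(x_\tau) \mid \tau \in J\}$ is linearly independent over $\varphi(K')$. For $\sigma$ not in $J$, we then have

$$\varphi(x_\sigma) = \sum_{\tau \in J} \varphi(a_\tau')\,\varphi(x_\tau) \quad \text{with } a_\tau \in K. \qquad (\ddagger\ddagger)$$

Every $c$ in $K$ satisfies

$$\varphi(\sigma(c)')\,\varphi(x_\sigma) = \varphi(x_\sigma)\,\varphi(c') = \sum_{\tau \in J} \varphi(a_\tau')\,\varphi(x_\tau)\,\varphi(c') = \sum_{\tau \in J} \varphi(a_\tau')\,\varphi(\tau(c)')\,\varphi(x_\tau),$$

and thus

$$\varphi(x_\sigma) = \sum_{\tau \in J} \varphi(\sigma(c)')^{-1}\varphi(a_\tau')\,\varphi(\tau(c)')\,\varphi(x_\tau).$$

Comparing this expansion with $(\ddagger\ddagger)$ shows that

$$\varphi(\sigma(c)')^{-1}\varphi(a_\tau')\,\varphi(\tau(c)') = \varphi(a_\tau') \quad \text{for } \tau \in J. \qquad (\S)$$

Since $x_\sigma$ is invertible in $A$, $\varphi(x_\sigma)$ is invertible in $A/I$ and cannot be 0. Therefore some $\varphi(a_\tau')$ in the expansion $(\ddagger\ddagger)$ is nonzero. For this $\tau$, we can cancel $\varphi(a_\tau')$ in $(\S)$ and obtain $\varphi(\sigma(c)') = \varphi(\tau(c)')$ for all $c \in K$. Since $\varphi$ is one-one on $K'$, we conclude that $\sigma = \tau$, in contradiction to the fact that $\sigma$ is not in $J$. Therefore $A$ is simple.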

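Fifth we show that $A$ has center $F$. Thus suppose that $\sum_\sigma c_\sigma' x_\sigma$ is central. Commutativity with $d' x_\tau$ forces the two expressions

$$\Big(\sum_\sigma c_\sigma' x_\sigma\Big)(d' x_\tau) = \sum_\sigma c_\sigma'\,\sigma(d)'\,x_\sigma x_\tau = \sum_\sigma c_\sigma'\,\sigma(d)'\,a(\sigma,\tau)'\,x_{\sigma\tau}$$

and

$$(d' x_\tau)\Big(\sum_\sigma c_\sigma' x_\sigma\Big) = \sum_\sigma d'\,\tau(c_\sigma)'\,a(\tau,\sigma)'\,x_{\tau\sigma} = \sum_\sigma d'\,\tau(c_{\tau^{-1}\sigma\tau})'\,a(\tau,\tau^{-1}\sigma\tau)'\,x_{\sigma\tau}$$

to be equal. Hence

$$d\,\tau(c_{\tau^{-1}\sigma\tau})\,a(\tau,\tau^{-1}\sigma\tau) = c_\sigma\,\sigma(d)\,a(\sigma,\tau) \quad \text{for all } d, \sigma, \tau. \qquad (\S\S)$$

Putting $d = 1$ in $(\S\S)$ shows that $\tau(c_{\tau^{-1}\sigma\tau})\,a(\tau,\tau^{-1}\sigma\tau) = c_\sigma\,a(\sigma,\tau)$. Substituting from this equation into the left side of $(\S\S)$ gives

$$d\,c_\sigma\,a(\sigma,\tau) = c_\sigma\,\sigma(d)\,a(\sigma,\tau) \quad \text{for all } d, \sigma, \tau.$$

If $c_\sigma \neq 0$, we see that $\sigma(d) = d$ for all $d \in K$; thus $c_\sigma \neq 0$ only for $\sigma = 1$. For $\sigma = 1$ and $d = 1$, $(\S\S)$ reduces to

$$\tau(c_1)\,a(\tau,1) = c_1\,a(1,\tau).$$

Taking into account $(\dagger)$ and $(\dagger\dagger)$, we obtain

$$\tau(c_1 a(1,1)) = c_1 a(1,1).$$

Since $\tau$ is arbitrary, this says that $c_1 a(1,1)$ is in $F$. Thus the central element is $c_1' x_1 = c_1 x_1 = c_1 a(1,1)\,a(1,1)^{-1}x_1 = (c_1 a(1,1))1$ and is an $F$ multiple of the identity.

Since $\{x_\sigma\}$ by definition is a basis of $A$ over $K$, we have $\dim_K A = |G| = \dim_F K$. Multiplying this equation by $\dim_F K$ yields $\dim_F A = (\dim_F K)^2$. This completes the proof.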
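As an illustration of the construction in Proposition 3.12, the following sketch builds the crossed product for the extension of the reals by the complex numbers, with Galois group generated by complex conjugation. The particular factor set (value $-1$ on the pair of conjugations and $1$ elsewhere), the encoding of the group as {0, 1} under XOR, and all function names are assumptions made for this example and are not taken from the text. With these choices the multiplication rule $(*)$ appears to reproduce Hamilton's quaternions.

```python
from itertools import product

G = (0, 1)  # 0 = identity, 1 = complex conjugation; composition is XOR (assumed encoding)


def act(s, z):
    """Apply the Galois element s to the complex number z."""
    return z.conjugate() if s == 1 else z


def a(s, t):
    """Illustrative factor set: a(conj, conj) = -1 and a = 1 otherwise."""
    return -1 if (s, t) == (1, 1) else 1


def mul(u, v):
    """Multiply u = sum_s u[s] x_s and v = sum_t v[t] x_t by the rule
    (c x_s)(d x_t) = c * s(d) * a(s, t) * x_{st}, extended additively."""
    out = {s: 0j for s in G}
    for s, t in product(G, G):
        out[s ^ t] += u[s] * act(s, v[t]) * a(s, t)
    return out


def cocycle_ok():
    """Check rho(a(sigma,tau)) a(rho,sigma tau) = a(rho,sigma) a(rho sigma,tau)."""
    return all(
        act(r, complex(a(s, t))) * a(r, s ^ t) == a(r, s) * a(r ^ s, t)
        for r, s, t in product(G, repeat=3)
    )


one = {0: 1 + 0j, 1: 0j}  # identity a(1,1)^{-1} x_1, which is x_1 here
i_ = {0: 1j, 1: 0j}       # the image i' of i in K'
j_ = {0: 0j, 1: 1 + 0j}   # the basis vector x_sigma
k_ = mul(i_, j_)          # their product
```

Here `i_`, `j_`, and `k_` play the roles of the usual quaternion units: each squares to the negative of `one`, and `mul(j_, i_)` is the negative of `mul(i_, j_)`, reflecting property (b).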


Corollary 3.13. If $K$ is a finite Galois extension of the field $F$, then the map $\mathcal{B}(K/F) \to H^2(\mathrm{Gal}(K/F), K^\times)$ defined via factor sets is one-one onto.


PROOF. Put $G = \mathrm{Gal}(K/F)$. If $a : G \times G \to K^\times$ is in $Z^2(G, K^\times)$, then we can construct an algebra $A$ via Proposition 3.12, and the claim is that the map $a \mapsto A$ descends to $H^2(G, K^\times)$ and is a two-sided inverse to the map from $\mathcal{B}(K/F)$ into $H^2(G, K^\times)$ given in Proposition 3.10.

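First we show that $a \mapsto A$ descends to $H^2(G, K^\times)$. Thus suppose that $b$ is a second cocycle and is of the form $b(\sigma,\tau) = a(\sigma,\tau)\,c_\sigma\,\sigma(c_\tau)\,c_{\sigma\tau}^{-1}$, i.e., represents the same member of $H^2(G, K^\times)$. Let $B$ be the algebra constructed from $b$ by Proposition 3.12, say with $K$ mapping to $K'' \subseteq B$ via $c \mapsto c''$ and with

(a′) $B = \bigoplus_{\sigma \in G} K'' y_\sigma$ for a subset $\{y_\sigma\}$ of $B$,
(b′) $y_\sigma c'' y_\sigma^{-1} = \sigma(c)''$,
(c′) $y_\sigma y_\tau = b(\sigma,\tau)''\,y_{\sigma\tau}$.

Define $\varphi : A \to B$ to be the additive extension of the function with $\varphi(c' x_\sigma) = c''(c_\sigma^{-1})''y_\sigma$. To check that $\varphi$ is an algebra homomorphism, we start from the formula $(c' x_\sigma)(d' x_\tau) = c'\,\sigma(d)'\,a(\sigma,\tau)'\,x_{\sigma\tau}$ and apply $\varphi$ to obtain

$$\varphi\big((c' x_\sigma)(d' x_\tau)\big) = c''\,\sigma(d)''\,a(\sigma,\tau)''\,(c_{\sigma\tau}^{-1})''\,y_{\sigma\tau}.$$

Meanwhile,

$$\varphi(c' x_\sigma)\,\varphi(d' x_\tau) = \big(c''(c_\sigma^{-1})''y_\sigma\big)\big(d''(c_\tau^{-1})''y_\tau\big) = c''(c_\sigma^{-1})''\,\sigma(d)''\,\sigma(c_\tau^{-1})''\,b(\sigma,\tau)''\,y_{\sigma\tau}$$
$$= c''(c_\sigma^{-1})''\,\sigma(d)''\,\sigma(c_\tau^{-1})''\,a(\sigma,\tau)''\,c_\sigma''\,\sigma(c_\tau)''\,(c_{\sigma\tau}^{-1})''\,y_{\sigma\tau}.$$

Hence $\varphi((c' x_\sigma)(d' x_\tau)) = \varphi(c' x_\sigma)\,\varphi(d' x_\tau)$, and $\varphi$ is an algebra homomorphism. Since $\varphi$ carries $K$ basis to $K$ basis, $\varphi$ is an algebra isomorphism.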

Thus the map $a \mapsto A$ descends to a map from $H^2(G, K^\times)$ into $\mathcal{B}(K/F)$. Starting from a cocycle $a$ in $Z^2(G, K^\times)$, we can construct $A$ and elements $x_\sigma$ by Proposition 3.12, we can apply Propositions 3.12b and 3.10 to the $x_\sigma$'s to obtain another cocycle $\bar{a}$ in $Z^2(G, K^\times)$, and we can use Proposition 3.12c to see that $\bar{a} = a$. In the reverse direction if we start from an algebra $A$, make a set of choices, and form a factor set $a$ by means of Proposition 3.10, then Proposition 3.12 constructs an algebra $\bar{A}$ that has to be isomorphic to $A$ because conditions (a) through (c) in Proposition 3.12 determine the same multiplication table for an algebra as was used in constructing the cocycle $a$.


Theorem 3.14. If $K$ is a finite Galois extension of the field $F$, then the map $\mathcal{B}(K/F) \to H^2(\mathrm{Gal}(K/F), K^\times)$ defined via factor sets is a group isomorphism.


REMARKS. Put $G = \mathrm{Gal}(K/F)$. In view of Corollary 3.13, it is enough to prove that the mapping $Z^2(G, K^\times) \to \mathcal{B}(K/F)$ of Proposition 3.12 is a group homomorphism. Thus let $A$, $B$, and $C$ be the crossed-product algebras $A = \mathcal{A}(K,G,a)$, $B = \mathcal{A}(K,G,b)$, and $C = \mathcal{A}(K,G,ab)$. We are to prove that $A \otimes_F B$ is Brauer equivalent to $C$. Each of $A$, $B$, and $C$ has $F$ dimension $(\dim_F K)^2$, and hence $A \otimes_F B$ will not be isomorphic to $C$. Consequently we need to prove Brauer equivalence of two specific nonisomorphic algebras. This is the circumstance that makes the proof complicated.


PROOF (Chase). Let $G$, $a$, $b$, $A$, $B$, and $C$ be as in the remarks. We can regard $A$ and $B$ as vector spaces over $K$ with $K$ acting on the left in each case. We define an $F$ vector space $M$ to be the quotient of $A \otimes_F B$ by the $F$ vector subspace $I$ generated by all vectors $ca \otimes b - a \otimes cb$ with $a \in A$, $b \in B$, and $c \in K$. We write $M = A \otimes_K B$ for this quotient, even though more standard notation for it might be $A^{o} \otimes_K B$ with $A^{o}$ as a right $K$ module and $B$ as a left $K$ module.

The subspace $I$ is carried to itself by right multiplication by any member of the algebra $A \otimes_F B$ and hence is a right ideal. The quotient $M$ is therefore a unital right $A \otimes_F B$ module with $(a \otimes_K b)(a' \otimes_F b') = aa' \otimes_K bb'$ for $a \otimes_K b$ in $M$ and $a' \otimes_F b'$ in $A \otimes_F B$.

We shall make the unital right $A \otimes_F B$ module $M$ into a unital $(C, A \otimes_F B)$ bimodule by introducing an action by $C$ on the left. For this purpose let $\{u_\sigma\}$, $\{v_\sigma\}$, and $\{w_\sigma\}$ be the distinguished $K$ bases of the algebras $A$, $B$, and $C$ indexed by $G$ and used to form $A$, $B$, and $C$ from the 2-cocycles $a$, $b$, and $ab$. Given an element $x w_\sigma$ in $C$ with $x \in K$, define $x w_\sigma$ on $A \otimes_F B$ to be (left by $x u_\sigma$) $\otimes$ (left by $v_\sigma$). Let us see that this operation carries the generators of $I$ into $I$. We have

$$(x w_\sigma)(ca \otimes_F b) - (x w_\sigma)(a \otimes_F cb) = x u_\sigma c a \otimes_F v_\sigma b - x u_\sigma a \otimes_F v_\sigma c b$$
$$= x\,\sigma(c)\,u_\sigma a \otimes_F v_\sigma b - x u_\sigma a \otimes_F \sigma(c)\,v_\sigma b$$
$$= \sigma(c)(x u_\sigma a) \otimes_F (v_\sigma b) - (x u_\sigma a) \otimes_F \sigma(c)(v_\sigma b),$$

and the right side is indeed in $I$. Thus we obtain an operation of $x w_\sigma$ on the left for $A \otimes_K B$ such that

$$(x w_\sigma)(a \otimes_K b) = x u_\sigma a \otimes_K v_\sigma b \quad \text{for } x \in K,\ \sigma \in G,\ a \in A,\ b \in B. \qquad (*)$$

We extend this definition by additivity in such a way that all of $C$ operates on the left for $A \otimes_K B$.

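The claim is that the additive extension of $(*)$ to $C$ makes $M = A \otimes_K B$ into a unital left $C$ module. What needs proof is that 1 acts as 1 and that

$$\big((x w_\sigma)(y w_\tau)\big)(a \otimes_K b) = (x w_\sigma)\big((y w_\tau)(a \otimes_K b)\big). \qquad (**)$$

The element 1 in $C$ is $a(1,1)^{-1}b(1,1)^{-1}w_1$, and we have

$$\big(a(1,1)^{-1}b(1,1)^{-1}w_1\big)(a \otimes_K b) = a(1,1)^{-1}b(1,1)^{-1}u_1 a \otimes_K v_1 b = a(1,1)^{-1}u_1 a \otimes_K b(1,1)^{-1}v_1 b = a \otimes_K b.$$

Thus 1 acts as 1. For $(**)$, the left side is

$$\big(x\,\sigma(y)\,a(\sigma,\tau)\,b(\sigma,\tau)\,w_{\sigma\tau}\big)(a \otimes_K b) = x\,\sigma(y)\,a(\sigma,\tau)\,b(\sigma,\tau)\,u_{\sigma\tau}a \otimes_K v_{\sigma\tau}b,$$

while the right side is

$$(x w_\sigma)(y u_\tau a \otimes_K v_\tau b) = x u_\sigma y u_\tau a \otimes_K v_\sigma v_\tau b = x\,\sigma(y)\,u_\sigma u_\tau a \otimes_K v_\sigma v_\tau b = x\,\sigma(y)\,a(\sigma,\tau)\,u_{\sigma\tau}a \otimes_K b(\sigma,\tau)\,v_{\sigma\tau}b.$$

These are equal, since $b(\sigma,\tau)$ is in $K$ and therefore moves across the tensor-product sign.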

Thus $M$ is a unital left $C$ module. The left action by $C$ certainly commutes with the right action by $A \otimes_F B$, and $M$ is consequently a unital $(C, A \otimes_F B)$ bimodule. Each member of $A \otimes_F B$ therefore yields by its right action a member of the ring $\mathrm{End}_C(M)$, and we obtain a ring homomorphism of $(A \otimes_F B)^{o}$ into $\mathrm{End}_C(M)$. Since $A \otimes_F B$ is a simple ring, this homomorphism is one-one. If we can prove that this homomorphism is onto, then we will have a ring isomorphism $(A \otimes_F B)^{o} \cong \mathrm{End}_C(M)$, and the rest will be easy.

To see that the homomorphism is onto, we shall calculate dimensions. Let $n = \dim_F K$. Then each of $A$, $B$, and $C$ has $F$ dimension $n^2$, and we have

$$\dim_F M = (\dim_F A)(\dim_F B)/(\dim_F K) = n^2 n^2/n = n^3 = (\dim_F C)\,n.$$

Since the algebra $C$ is simple, every unital left $C$ module is semisimple and is in fact isomorphic to a multiple of a simple left $C$ module $V$. The above dimensional equality says that if $r$ is the integer such that $C$ is isomorphic to $rV$ as a left $C$ module, then $M$ is isomorphic to $nrV$.

Let $D^{o}$ be the division algebra $\mathrm{End}_C(V)$. As in the proof of Wedderburn’s Theorem (Theorem 2.2), we know for each integer $m$ that

$$\mathrm{End}_C(mV) \cong M_m(\mathrm{End}_C(V)) = M_m(D^{o}). \qquad (\dagger)$$

Taking $m = r$ in $(\dagger)$ gives $C^{o} \cong \mathrm{End}_C(rV) \cong M_r(D^{o})$. Hence

$$C \cong M_r(D), \qquad (\dagger\dagger)$$

and $\dim_F C = r^2 \dim_F D$. Since $\dim_F C = (\dim_F K)^2 = n^2$, we obtain $\dim_F D = n^2/r^2$. Taking $m = nr$ in $(\dagger)$ gives

$$\mathrm{End}_C(M) \cong \mathrm{End}_C(nrV) \cong M_{nr}(D^{o}), \qquad (\ddagger)$$

and we therefore obtain

$$\dim_F \mathrm{End}_C(M) = n^2 r^2 \dim_F D = (n^2 r^2)(n^2/r^2) = n^4.$$

Since $\dim_F(A \otimes_F B) = n^4$, we obtain $\dim_F(A \otimes_F B)^{o} = \dim_F \mathrm{End}_C(M)$, and we conclude that the algebra homomorphism $(A \otimes_F B)^{o} \to \mathrm{End}_C(M)$ is onto. Thus it is an isomorphism, and $A \otimes_F B \cong (\mathrm{End}_C(M))^{o}$.

Combining this isomorphism with $(\ddagger)$ shows that $A \otimes_F B \cong M_{nr}(D)$. In view of $(\dagger\dagger)$, $A \otimes_F B$ is therefore Brauer equivalent to $C$.


Corollary 3.15. If $D$ is a finite-dimensional central division algebra of dimension $m^2$ over a field $F$, then the $m$-fold tensor product of $D$ with itself over $F$ is a full matrix algebra over $F$.


PROOF. Corollary 3.9 produces a finite Galois extension $K$ of $F$ such that $K$ splits $D$. Write $G$ for $\mathrm{Gal}(K/F)$. In view of Theorems 3.3 and 2.4, there exists an integer $l$ such that $A = M_l(D)$ contains a subfield $K'$ isomorphic to $K$ with $\dim_F A = (\dim_F K')^2$. Changing notation, we may redefine $K = K'$. Let $n = \dim_F K$. Then $n^2 = \dim_F A = l^2 \dim_F D = (lm)^2$, and $n = lm$. Following the construction of factor sets in Section 2 and using Lemma 3.11, we form a vector-space basis $\{x_\sigma \mid \sigma \in G\}$ of $A$ over $K$ and a factor set $\{a(\sigma,\tau)\}$ such that $x_\sigma x_\tau = a(\sigma,\tau)\,x_{\sigma\tau}$ and $\sigma(c) = x_\sigma c\,x_\sigma^{-1}$ for all $c$ in $K$.

Example 1 of semisimple rings in Section II.2 shows that the left $A$ module $A$ is the direct sum of $l$ isomorphic simple left $A$ modules. Let $V$ be one of these. Restricting the module structure of $V$ from $A$ to $K$ makes $V$ into a unital left $K$ module, hence into a vector space over $K$. Then we have

$$n^2 = \dim_F A = l \dim_F V = l\,(\dim_K V)(\dim_F K) = l n \dim_K V,$$

and $\dim_K V = m$. Let $v_1, \dots, v_m$ be a $K$ basis of $V$. For each $x \in A$, define a matrix $C(x)$ in $M_m(K)$ by

$$x v_j = \sum_{i=1}^{m} C(x)_{ij}\,v_i.$$

For $\sigma$ and $\tau$ in $G$, we compute $x_\sigma x_\tau v_j$ in two ways as

$$x_\sigma x_\tau v_j = a(\sigma,\tau)\,x_{\sigma\tau}v_j = a(\sigma,\tau) \sum_{i=1}^{m} C(x_{\sigma\tau})_{ij}\,v_i \qquad (*)$$

and as

$$x_\sigma x_\tau v_j = x_\sigma \sum_{k} C(x_\tau)_{kj}\,v_k = \sum_{k} \sigma(C(x_\tau)_{kj})\,x_\sigma v_k = \sum_{i,k} \sigma(C(x_\tau)_{kj})\,C(x_\sigma)_{ik}\,v_i.$$

If we write $\sigma(C(x_\tau))$ for the result of applying $\sigma$ to each entry of $C(x_\tau)$, then we obtain

$$x_\sigma x_\tau v_j = \sum_{i} \big(C(x_\sigma)\,\sigma(C(x_\tau))\big)_{ij}\,v_i. \qquad (**)$$

Comparing $(*)$ and $(**)$ leads to the matrix equation in $M_m(K)$ given by

$$a(\sigma,\tau)\,C(x_{\sigma\tau}) = C(x_\sigma)\,\sigma(C(x_\tau)).$$

Putting $c_\sigma = \det C(x_\sigma)$ and taking the determinant of both sides yields

$$a(\sigma,\tau)^m c_{\sigma\tau} = c_\sigma\,\sigma(c_\tau).$$

This equation shows that $a(\sigma,\tau)^m$ is a trivial factor set. Applying Theorem 3.14, we see that the $m$th power of the Brauer equivalence class of $A$ is trivial. Since $A$ is Brauer equivalent to $D$, the corollary follows.
Corollary 3.16. If $F$ is any field, then every element of $\mathcal{B}(F)$ has finite order.


PROOF. If $A$ is any central simple algebra over $F$, then Theorem 2.4 shows that $A \cong M_l(D)$ for some integer $l \geq 1$ and some central division algebra $D$ over $F$. Corollary 3.15 shows that the Brauer equivalence class of $D$ has finite order in $\mathcal{B}(F)$. Since $A$ is Brauer equivalent to $D$, the same thing is true for $A$.


4. Hilbert’s Theorem 90 


Let $K/F$ be a finite Galois extension of fields. Our interest in this section will be in the cohomology groups $H^q(\mathrm{Gal}(K/F), K^\times)$ with $q$ possibly different from 2. For $q = 0$, $H^0(G, N)$ is always the subgroup of elements of $N$ fixed by every element of $G$. In the case of a Galois extension, the members of $K^\times$ fixed by the Galois group are the nonzero elements of the base field $F$. Thus we have

$$H^0(\mathrm{Gal}(K/F), K^\times) = F^\times.$$

In addition, Theorem 3.14 has established an isomorphism

$$H^2(\mathrm{Gal}(K/F), K^\times) \cong \mathcal{B}(K/F),$$

and thus we have already obtained some understanding of this group for $q = 2$.

We shall examine $H^1$ in a moment, but first we take note of another fact about $H^2$. Problem 16b at the end of Chapter VII of Basic Algebra shows that if $G$ is a finite group and $N$ is an abelian group on which $G$ acts by automorphisms, then every element of $H^q(G, N)$ for $q > 0$ has order dividing $|G|$. In particular, every element of $H^2(\mathrm{Gal}(K/F), K^\times)$ has order dividing $\dim_F K$ whenever $K$ is a finite Galois extension of $F$. Applying Theorem 3.14, we see that every member of $\mathcal{B}(K/F)$ has order dividing $\dim_F K$. In view of Corollary 3.9, this argument gives a new and shorter proof of the result of Corollary 3.16 that every member of $\mathcal{B}(F)$ has finite order. The estimate of the order via Corollary 3.15, however, is sharper than the estimate obtained via the shorter proof, and this distinction makes all the difference in Problem 12 at the end of the chapter.

The result concerning H! and its important special case given as Corollary 
3.18 below are known as Hilbert’s Theorem 90. 


Theorem 3.17. If $K/F$ is any finite Galois extension of fields, then $H^1(\mathrm{Gal}(K/F), K^\times) = 0$.


PROOF. Let $G = \mathrm{Gal}(K/F)$, put $n = \dim_F K$, and enumerate $G$ as $\sigma_1, \dots, \sigma_n$. By the Theorem of the Primitive Element, we can write $K = F(\alpha)$ for some $\alpha$ in $K$, and then $\{1, \alpha, \alpha^2, \dots, \alpha^{n-1}\}$ is a basis of $K$ over $F$. Form the $n$-by-$n$ matrix $M$ with entries in $K$ whose $(i,j)$th entry is $\sigma_j(\alpha^{i-1})$. This is a Vandermonde matrix, and Corollary 5.3 of Basic Algebra gives its determinant as $\prod_{i<j} [\sigma_j(\alpha) - \sigma_i(\alpha)]$. This determinant cannot be 0, since $\sigma_j(\alpha) = \sigma_i(\alpha)$ implies $\sigma_j(\alpha^k) = \sigma_j(\alpha)^k = \sigma_i(\alpha)^k = \sigma_i(\alpha^k)$ for all $k$ and then $\sigma_j(x) = \sigma_i(x)$ for all $x$. Hence the matrix $M$ is nonsingular.

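Let $f$ be a nonzero element in $Z^1(G, K^\times)$. Such a function $f : G \to K^\times$ is nowhere vanishing and has $f(\sigma\tau) = f(\sigma)\,\sigma(f(\tau))$ for all $\sigma$ and $\tau$ in $G$. Since the matrix $M$ is nonsingular, the nontrivial linear combination $\sum_{\sigma \in G} f(\sigma)\,\sigma$ cannot be 0 on all members of the basis $\{1, \alpha, \alpha^2, \dots, \alpha^{n-1}\}$. Choose $k$ with $\sum_{\sigma \in G} f(\sigma)\,\sigma(\alpha^k) = y \neq 0$. Applying $\tau \in G$ to this equation, we obtain

$$\tau(y) = \sum_{\sigma \in G} \tau(f(\sigma))\,\tau\sigma(\alpha^k) = \sum_{\sigma \in G} f(\tau\sigma)\,f(\tau)^{-1}\,\tau\sigma(\alpha^k) = f(\tau)^{-1} \sum_{\sigma \in G} f(\sigma)\,\sigma(\alpha^k) = f(\tau)^{-1} y.$$

The equation $f(\tau)^{-1} = \tau(y)\,y^{-1}$ shows that $f^{-1}$ is a coboundary, hence that $f$ is a coboundary.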


Corollary 3.18. If $K/F$ is a finite Galois extension with cyclic Galois group and if $\sigma$ is a generator of the Galois group, then every member $x$ of $K$ with $N_{K/F}(x) = 1$ is of the form $x = \sigma(y)\,y^{-1}$ for some $y \in K^\times$.


REMARKS. The instance of this corollary in which K is a quadratic number 
field and F is the field Q appears as Problem 27 at the end of Chapter I. In 
subsequent problems at the end of that chapter, Problem 27 plays a crucial role 
in showing that various possible definitions of genera are equivalent. 


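PROOF. Let $G = \{1, \sigma, \sigma^2, \dots, \sigma^{n-1}\}$ be the Galois group, and define a function $F : \mathbb{Z} \to K^\times$ by $F(0) = 1$ and

$$F(k) = x\,\sigma(x)\,\sigma^2(x) \cdots \sigma^{k-1}(x) \quad \text{for } k \geq 1.$$

Then we have

$$F(k+l) = x\,\sigma(x)\,\sigma^2(x) \cdots \sigma^{k+l-1}(x) = \big(x\,\sigma(x)\,\sigma^2(x) \cdots \sigma^{k-1}(x)\big)\,\sigma^k\big(x\,\sigma(x)\,\sigma^2(x) \cdots \sigma^{l-1}(x)\big) = F(k)\,\sigma^k(F(l)). \qquad (*)$$

The condition that $N_{K/F}(x) = 1$ is exactly the condition that $F(n) = 1$. Then $F(k+n) = F(k)\,\sigma^k(F(n)) = F(k)$ for all $k$, and it is meaningful to define a 1-cochain $f : G \to K^\times$ in $C^1(G, K^\times)$ by $f(\sigma^k) = F(k)$. Condition $(*)$ implies that $f(\sigma^k \sigma^l) = f(\sigma^k)\,\sigma^k(f(\sigma^l))$, and hence $f$ is a cocycle in $Z^1(G, K^\times)$. Theorem 3.17 shows that $f$ is a coboundary in $B^1(G, K^\times)$, necessarily satisfying $f(\tau) = \tau(y)\,y^{-1}$ for some $y \in K^\times$ and all $\tau \in G$. Taking $\tau = \sigma$, we obtain $x = f(\sigma) = \sigma(y)\,y^{-1}$, as required.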
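The averaging argument in the proof of Theorem 3.17 is constructive, and a small sketch may make Corollary 3.18 concrete. The example below works in the quadratic extension obtained by adjoining $i$ to the rationals, with $\sigma$ equal to complex conjugation; the representation of elements as pairs of fractions, the function names, and the default choice of the auxiliary element `theta` are all assumptions made for illustration. The sum $z = \theta + x\,\sigma(\theta)$ satisfies $\sigma(z) = x^{-1}z$ when $x$ has norm 1, so $y = z^{-1}$ has $\sigma(y)\,y^{-1} = x$ whenever $z \neq 0$.

```python
from fractions import Fraction as Fr

# An element a + b*i of Q(i) is stored as the pair (a, b) of Fractions.


def mul(x, y):
    """Product of two elements of Q(i)."""
    return (x[0] * y[0] - x[1] * y[1], x[0] * y[1] + x[1] * y[0])


def conj(x):
    """The generator sigma of Gal(Q(i)/Q): complex conjugation."""
    return (x[0], -x[1])


def norm(x):
    """N_{K/F}(x) = x * sigma(x), a rational number."""
    return x[0] * x[0] + x[1] * x[1]


def inv(x):
    """Multiplicative inverse of a nonzero element."""
    n = norm(x)
    return (x[0] / n, -x[1] / n)


def hilbert90(x, theta=(Fr(1), Fr(0))):
    """Return y with x = sigma(y) / y, for x of norm 1.

    Uses z = f(1)*theta + f(sigma)*sigma(theta) with f(1) = 1, f(sigma) = x,
    as in the proof of Theorem 3.17; theta is an auxiliary element that
    must be chosen so that z is nonzero.
    """
    if norm(x) != 1:
        raise ValueError("x must have norm 1")
    z = tuple(p + q for p, q in zip(theta, mul(x, conj(theta))))
    if z == (0, 0):
        raise ValueError("this theta gives z = 0; try another theta")
    return inv(z)


x = (Fr(3, 5), Fr(4, 5))   # (3 + 4i)/5 has norm 1
y = hilbert90(x)           # y = 1/2 - i/4
```

For $x = -1$ the default `theta` gives $z = 0$, which mirrors the step in the proof where a basis element with nonzero sum has to be chosen; passing `theta=(Fr(0), Fr(1))` works in that case.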
Our final result concerning $H^q(\mathrm{Gal}(K/F), K^\times)$ for this chapter gives further information about the special case in which $\mathrm{Gal}(K/F)$ is cyclic, but now for general $q$. In combination with the study of crossed-product algebras, the case $q = 2$ of this result provides a way of constructing new examples of noncommutative division algebras. A key step in the proof makes use of a fundamental general property concerning cohomology of groups, and we therefore digress in Section 5 to establish this property.


5. Digression on Cohomology of Groups 


This section develops general material about cohomology of groups. Although 
the earlier sections of this chapter are helpful for motivation, the results that we 
discuss in this section do not rely on any previous material in this volume. It 
will be assumed that the reader is familiar with the definitions of complexes and 
exact sequences in Chapter X of Basic Algebra, as well as with the application 
of tensor-product functors and Hom functors to exact sequences and complexes. 
The material in Chapter VII of Basic Algebra on cohomology of groups will be 
helpful as background, but it is unnecessary from a logical point of view. If $R$ is a ring with identity, we denote by $\mathcal{C}_R$ the category of all unital left $R$ modules.

Let G be a group, not necessarily finite. We shall work with the integral group 
ring ZG of G. It has the universal mapping property that whenever G acts by 
automorphisms on an abelian group M, then the action by G on M extends to 
ZG in a unique way that makes M into a unital left ZG module. 

Here is a brief overview of what is to happen in this section: If $G$ acts on the abelian group $M$ by automorphisms, then the abelian group $C^n(G, M)$ of $n$-cochains is the set of functions into $M$ from the $n$-fold product of $G$ with itself, the operation being given by addition of the values of the functions. To define the cohomology group $H^n(G, M)$, one introduces suitable homomorphisms known as “coboundary maps” $\delta_n : C^n(G, M) \to C^{n+1}(G, M)$ and shows that the sequence

$$0 \longrightarrow C^0(G, M) \xrightarrow{\ \delta_0\ } \cdots \xrightarrow{\ \delta_{n-1}\ } C^n(G, M) \xrightarrow{\ \delta_n\ } C^{n+1}(G, M) \xrightarrow{\ \delta_{n+1}\ } \cdots$$

of abelian groups and homomorphisms is a complex in the category $\mathcal{C}_{\mathbb{Z}}$. Then it is meaningful to define $H^n(G, M) = (\ker \delta_n)/(\mathrm{image}\ \delta_{n-1})$ for $n \geq 0$ if we adopt the convention that $\mathrm{image}\ \delta_{-1} = 0$. The first thing that we shall do in this section is to exhibit a certain exact sequence in the category $\mathcal{C}_{\mathbb{Z}G}$ such that the above complex is obtained from it by application of the functor $\mathrm{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$ and the dropping of one term of the form $\mathrm{Hom}_{\mathbb{Z}G}(\mathbb{Z}, M)$. Except for a single term $\mathbb{Z}$, the members of this exact sequence will all be free $\mathbb{Z}G$ modules, and the exact sequence will be called the “standard resolution of $\mathbb{Z}$ in the category $\mathcal{C}_{\mathbb{Z}G}$.” The exactness is proved in Theorem 3.20, and the application of $\mathrm{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$ to it appears after the proof of the theorem.

The next thing that we shall do is show that if the standard resolution of $\mathbb{Z}$ is changed to any exact sequence in $\mathcal{C}_{\mathbb{Z}G}$ in such a way that the free $\mathbb{Z}G$ modules are replaced by other free $\mathbb{Z}G$ modules and the module $\mathbb{Z}$ is left unchanged, then application of $\mathrm{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$ to the new exact sequence leads to canonically isomorphic cohomology groups. This result appears below as Theorem 3.31. In brief, the cohomology groups $H^n(G, M)$ can be computed starting from any “free resolution of $\mathbb{Z}$” in the category $\mathcal{C}_{\mathbb{Z}G}$ in place of the standard resolution.

We begin by constructing the “standard resolution of $\mathbb{Z}$.” For $n \geq 0$, let $F_n$ be the free abelian group with $\mathbb{Z}$ basis the set of all $(n+1)$-tuples $(g_0, \dots, g_n)$ with all $g_i \in G$. The group $G$ acts on $F_n$ by automorphisms, the action on the members of the $\mathbb{Z}$ basis being

$$g(g_0, \dots, g_n) = (g g_0, \dots, g g_n).$$

The universal mapping property of $\mathbb{Z}G$ then allows us to regard each $F_n$ as a unital left $\mathbb{Z}G$ module.


Lemma 3.19. For $n \geq 0$, the left $\mathbb{Z}G$ module $F_n$ is a free $\mathbb{Z}G$ module with $\mathbb{Z}G$ basis consisting of all $(n+1)$-tuples $(1, g_1, \dots, g_n)$, i.e., all $\mathbb{Z}$ basis elements with $g_0 = 1$.


PROOF. The formula $g_0(1, g_0^{-1}g_1, \dots, g_0^{-1}g_n) = (g_0, g_1, \dots, g_n)$ shows that all members of the $\mathbb{Z}$ basis defining $F_n$ are $\mathbb{Z}G$ images of the asserted $\mathbb{Z}G$ basis; hence the asserted $\mathbb{Z}G$ basis is a spanning set of $F_n$ relative to $\mathbb{Z}G$. Suppose that there are finitely many distinct members $h_j$ of $G$ and finitely many distinct $(n+1)$-tuples $(1, g_{i,1}, \dots, g_{i,n})$, and members $\sum_j n_{ij} h_j$ of $\mathbb{Z}G$ such that

$$\sum_i \Big(\sum_j n_{ij} h_j\Big)(1, g_{i,1}, \dots, g_{i,n}) = 0.$$

Then

$$\sum_{i,j} n_{ij}\,(h_j, h_j g_{i,1}, \dots, h_j g_{i,n}) = 0.$$

Since the $h_j$'s are distinct as $j$ varies and the $n$-tuples $(g_{i,1}, \dots, g_{i,n})$ are distinct as $i$ varies, the $(n+1)$-tuples $(h_j, h_j g_{i,1}, \dots, h_j g_{i,n})$ are distinct as the pair $(i, j)$ varies. Thus the $\mathbb{Z}$ independence implies that $n_{ij} = 0$ for all $i$ and $j$. This proves the lemma.
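The counting behind Lemma 3.19 can be checked directly for a small group. The sketch below takes $G$ to be the cyclic group of order 3 written additively, so that the identity is 0 rather than 1; this choice of group and all names in the code are assumptions made for illustration. It checks that every $(n+1)$-tuple arises exactly once as a group element applied to a tuple whose first entry is the identity.

```python
from itertools import product

ORDER = 3            # G = Z/3, written additively (an assumed example group)
G = range(ORDER)


def translate(h, tup):
    """The action of h in G on a Z-basis tuple: h(g_0, ..., g_n)."""
    return tuple((h + g) % ORDER for g in tup)


def decompose(tup):
    """Write tup as g_0 applied to (identity, g_0^{-1} g_1, ..., g_0^{-1} g_n)."""
    g0 = tup[0]
    return g0, translate(-g0, tup)


def is_free_basis(n):
    """Check that (h, basis tuple) -> h(basis tuple) is a bijection
    from G x {tuples starting with the identity} onto all (n+1)-tuples."""
    basis = [(0,) + rest for rest in product(G, repeat=n)]
    images = [translate(h, b) for h in G for b in basis]
    return sorted(images) == sorted(product(G, repeat=n + 1))
```

Calling `is_free_basis(n)` for small `n` confirms the spanning and independence statements of the lemma at the level of basis tuples for this one group; it is not a substitute for the proof.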




For $n \geq 1$, we define $\partial_{n-1} : F_n \to F_{n-1}$ as a function from the $\mathbb{Z}$ basis into $F_{n-1}$ by

$$\partial_{n-1}(g_0, \dots, g_n) = \sum_{i=0}^{n} (-1)^i (g_0, \dots, \widehat{g_i}, \dots, g_n),$$

where the symbol $\widehat{\ \ }$ indicates an expression to be omitted. We extend $\partial_{n-1}$ to all of $F_n$ by the universal mapping property of free abelian groups. For $g$ in $G$ and for any $\mathbb{Z}$ generator $x$ of $F_n$, it is evident that $\partial_{n-1}(gx) = g(\partial_{n-1}(x))$. Since $\partial_{n-1}$ is a homomorphism of abelian groups, the formula $\partial_{n-1}(gx) = g(\partial_{n-1}(x))$ extends to all $x$'s in $F_n$. Since $G$ and $\mathbb{Z}$ generate $\mathbb{Z}G$, we obtain $\partial_{n-1}(rx) = r(\partial_{n-1}(x))$ for all $r \in \mathbb{Z}G$ and all $x \in F_n$. In other words, each $\partial_{n-1}$ is a $\mathbb{Z}G$ homomorphism.

We shall make use of one additional $\mathbb{Z}G$ homomorphism. According to Lemma 3.19, the $\mathbb{Z}G$ module $F_0$ is free on the $\mathbb{Z}G$ basis $\{(1)\}$. Let us think of the group $G$ as acting trivially by automorphisms on the abelian group $\mathbb{Z}$. Under this action, $\mathbb{Z}$ becomes a $\mathbb{Z}G$ module. Define $\varepsilon : F_0 \to \mathbb{Z}$ to be the $\mathbb{Z}G$ homomorphism with $\varepsilon((1)) = 1$. Then $\varepsilon((g_0)) = g_0(\varepsilon((1))) = g_0 \cdot 1 = 1$ for all $g_0 \in G$. The $\mathbb{Z}G$ homomorphism $\varepsilon$ is called the augmentation map.


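Theorem 3.20. If $G$ is any group, then the sequence

$$\cdots \xrightarrow{\ \partial_{n+1}\ } F_{n+1} \xrightarrow{\ \partial_n\ } F_n \xrightarrow{\ \partial_{n-1}\ } \cdots \xrightarrow{\ \partial_0\ } F_0 \xrightarrow{\ \varepsilon\ } \mathbb{Z} \longrightarrow 0$$

of left unital $\mathbb{Z}G$ modules and $\mathbb{Z}G$ homomorphisms is exact.


REMARKS. The displayed sequence is called the standard resolution of $\mathbb{Z}$ in the category $\mathcal{C}_{\mathbb{Z}G}$. The proof will be preceded by two lemmas.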


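Lemma 3.21. The sequence

$$\cdots \xrightarrow{\ \partial_{n+1}\ } F_{n+1} \xrightarrow{\ \partial_n\ } F_n \xrightarrow{\ \partial_{n-1}\ } \cdots \xrightarrow{\ \partial_0\ } F_0 \xrightarrow{\ \varepsilon\ } \mathbb{Z} \longrightarrow 0$$

in $\mathcal{C}_{\mathbb{Z}G}$ is a complex, i.e., $\partial_{n-1}\partial_n = 0$ for $n \geq 1$ and also $\varepsilon\partial_0 = 0$.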


PROOF. With the understanding that the symbol $\widehat{\ \ }$ indicates an expression to be omitted, we have

$$\partial_{n-1}\partial_n(g_0, \dots, g_{n+1}) = \sum_{i=0}^{n+1} (-1)^i\,\partial_{n-1}(g_0, \dots, \widehat{g_i}, \dots, g_{n+1})$$
$$= \sum_{i=0}^{n+1} (-1)^i \sum_{j=0}^{i-1} (-1)^j (g_0, \dots, \widehat{g_j}, \dots, \widehat{g_i}, \dots, g_{n+1}) + \sum_{i=0}^{n+1} (-1)^i \sum_{j=i+1}^{n+1} (-1)^{j-1} (g_0, \dots, \widehat{g_i}, \dots, \widehat{g_j}, \dots, g_{n+1})$$
$$= \sum_{i=0}^{n+1} \sum_{j=0}^{i-1} (-1)^{i+j} (g_0, \dots, \widehat{g_j}, \dots, \widehat{g_i}, \dots, g_{n+1}) - \sum_{i=0}^{n+1} \sum_{j=i+1}^{n+1} (-1)^{i+j} (g_0, \dots, \widehat{g_i}, \dots, \widehat{g_j}, \dots, g_{n+1}).$$

If we interchange the order of summation in the second double sum on the right, we see that the result equals the first double sum on the right. Thus the difference is 0.

This handles all the consecutive compositions except for $\varepsilon\partial_0$. For this we have

$$\varepsilon\partial_0(g_0, g_1) = \varepsilon((g_1)) - \varepsilon((g_0)) = 1 - 1 = 0.$$


Lemma 3.22. Fix $s$ in $G$. For $n \geq 0$, define a homomorphism $h_n : F_n \to F_{n+1}$ of abelian groups to be the additive extension of the function with

$$h_n(g_0, \dots, g_n) = (s, g_0, \dots, g_n),$$

and define $h_{-1} : \mathbb{Z} \to F_0$ by $h_{-1}(k) = k(s)$. Then $\partial_n h_n + h_{n-1}\partial_{n-1} = 1$ for $n \geq 1$, and also $\partial_0 h_0 + h_{-1}\varepsilon = 1$.


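PROOF. On the $\mathbb{Z}$ basis of $(n+1)$-tuples in $F_n$, we have

$$\partial_n h_n(g_0, \dots, g_n) = \partial_n(s, g_0, \dots, g_n) = (g_0, \dots, g_n) + \sum_{i=0}^{n} (-1)^{i+1} (s, g_0, \dots, \widehat{g_i}, \dots, g_n)$$

and also

$$h_{n-1}\partial_{n-1}(g_0, \dots, g_n) = \sum_{i=0}^{n} (-1)^i (s, g_0, \dots, \widehat{g_i}, \dots, g_n).$$

The sum of these is $(g_0, \dots, g_n)$, as required. Also,

$$\partial_0 h_0(g_0) = \partial_0(s, g_0) = (g_0) - (s) \qquad \text{and} \qquad h_{-1}\varepsilon(g_0) = h_{-1}1 = (s).$$

Thus $\partial_0 h_0(g_0) + h_{-1}\varepsilon(g_0) = (g_0)$, and $\partial_0 h_0 + h_{-1}\varepsilon = 1$.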
PROOF OF THEOREM 3.20. Lemma 3.21 gives $\mathrm{image}\ \partial_n \subseteq \ker \partial_{n-1}$ and $\mathrm{image}\ \partial_0 \subseteq \ker \varepsilon$. For the reverse of the first inclusion, let $x \in F_n$ be given with $\partial_{n-1}x = 0$ and $n \geq 1$. Then Lemma 3.22 gives $x = \partial_n h_n x + h_{n-1}\partial_{n-1}x$. The second term on the right side is 0, and therefore $x = \partial_n(h_n x)$ is in $\mathrm{image}\ \partial_n$.

For the reverse of the inclusion $\mathrm{image}\ \partial_0 \subseteq \ker \varepsilon$, let $x \in F_0$ be given with $\varepsilon x = 0$. Then Lemma 3.22 gives $x = \partial_0 h_0 x + h_{-1}\varepsilon x$. The second term on the right side is 0, and therefore $x = \partial_0(h_0 x)$ is in $\mathrm{image}\ \partial_0$.
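The two identities that drive this proof, namely that consecutive boundary maps compose to 0 and that the maps of Lemma 3.22 give a contracting homotopy, can be checked mechanically on basis tuples. The sketch below stores a chain as a dictionary from tuples to integer coefficients; that representation, the function names, and the convention of identifying the integers with multiples of the empty tuple (so that the augmentation map becomes one more instance of the same deletion formula) are assumptions made for this illustration.

```python
from collections import Counter
from itertools import product


def clean(chain):
    """Drop zero coefficients so that equal chains compare equal."""
    return {t: c for t, c in chain.items() if c != 0}


def boundary(chain):
    """Alternating sum of the tuples obtained by deleting one entry.

    On 1-tuples this sends (g_0) to the empty tuple with coefficient 1,
    which plays the role of the augmentation map.
    """
    out = Counter()
    for tup, coeff in chain.items():
        for i in range(len(tup)):
            out[tup[:i] + tup[i + 1:]] += (-1) ** i * coeff
    return clean(out)


def h(chain, s):
    """The homotopy of Lemma 3.22: prepend the fixed element s."""
    return {(s,) + tup: coeff for tup, coeff in chain.items()}


def add(c1, c2):
    """Sum of two chains."""
    out = Counter(c1)
    for tup, coeff in c2.items():
        out[tup] += coeff
    return clean(out)
```

For a basis chain `x = {tup: 1}`, the expressions `boundary(boundary(x))` and `add(boundary(h(x, s)), h(boundary(x), s))` should come out as the zero chain `{}` and as `x` itself, respectively. The deletion formula never uses the group law, so the group enters only through the choice of `s`.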


With the standard resolution of $\mathbb{Z}$ in $\mathcal{C}_{\mathbb{Z}G}$ now known to be exact, we examine the effect of applying the functor $\mathrm{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$ to it. This functor is contravariant and carries $\mathcal{C}_{\mathbb{Z}G}$ to the category $\mathcal{C}_{\mathbb{Z}}$ of all abelian groups. On a unital left $\mathbb{Z}G$ module $F$, this functor yields the abelian group $\mathrm{Hom}_{\mathbb{Z}G}(F, M)$. On a $\mathbb{Z}G$ module homomorphism $\varphi : F \to F'$, it yields the homomorphism

$$\mathrm{Hom}(\varphi, 1) : \mathrm{Hom}_{\mathbb{Z}G}(F', M) \to \mathrm{Hom}_{\mathbb{Z}G}(F, M)$$

of abelian groups given by $\mathrm{Hom}(\varphi, 1)(\psi) = \psi \circ \varphi$ for $\psi \in \mathrm{Hom}_{\mathbb{Z}G}(F', M)$. We know from Chapter X of Basic Algebra that this functor carries complexes to complexes but does not necessarily preserve exactness.

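Before applying $\mathrm{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$ to the standard resolution of $\mathbb{Z}$, it is customary to drop the term $\mathbb{Z}$ and the augmentation map, obtaining a modified sequence

$$\cdots \xrightarrow{\ \partial_{n+1}\ } F_{n+1} \xrightarrow{\ \partial_n\ } F_n \xrightarrow{\ \partial_{n-1}\ } \cdots \xrightarrow{\ \partial_0\ } F_0 \longrightarrow 0$$

that is still a complex in $\mathcal{C}_{\mathbb{Z}G}$. Let us define $d_n = \mathrm{Hom}(\partial_n, 1)$. Then the result of applying $\mathrm{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$ to the modified complex is the complex

$$0 \longrightarrow \mathrm{Hom}_{\mathbb{Z}G}(F_0, M) \xrightarrow{\ d_0\ } \cdots \xrightarrow{\ d_{n-1}\ } \mathrm{Hom}_{\mathbb{Z}G}(F_n, M) \xrightarrow{\ d_n\ } \mathrm{Hom}_{\mathbb{Z}G}(F_{n+1}, M) \xrightarrow{\ d_{n+1}\ } \cdots$$

in $\mathcal{C}_{\mathbb{Z}}$. To each $\varphi$ in $\mathrm{Hom}_{\mathbb{Z}G}(F_n, M)$, we associate $f = \Phi_n(\varphi)$ in $C^n(G, M)$ by the definition

$$f(g_1, \dots, g_n) = \varphi(1, g_1, g_1 g_2, \dots, g_1 \cdots g_n).$$

Any member $\varphi$ of $\mathrm{Hom}_{\mathbb{Z}G}(F_n, M)$ is determined by its values on $(n+1)$-tuples $(1, g_1, \dots, g_n)$, since we can factor out the first entry of the argument of $\varphi$ and commute it past $\varphi$, and it follows that the system of group homomorphisms

$$\Phi_n : \mathrm{Hom}_{\mathbb{Z}G}(F_n, M) \to C^n(G, M)$$

is a system of isomorphisms of abelian groups. Let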
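$$\delta_n : C^n(G, M) \to C^{n+1}(G, M)$$

be the map corresponding to $d_n : \mathrm{Hom}_{\mathbb{Z}G}(F_n, M) \to \mathrm{Hom}_{\mathbb{Z}G}(F_{n+1}, M)$ under this system of isomorphisms, namely $\delta_n = \Phi_{n+1} \circ d_n \circ \Phi_n^{-1}$. We can calculate $\delta_n$ explicitly as follows: If $f = \Phi_n(\varphi)$, then $\delta_n f = (\Phi_{n+1} d_n \Phi_n^{-1})(\Phi_n)(\varphi) = \Phi_{n+1} d_n \varphi$, and therefore

$$(\delta_n f)(g_1, \dots, g_{n+1}) = (d_n \varphi)(1, g_1, g_1 g_2, \dots, g_1 \cdots g_{n+1})$$
$$= \varphi\big(\partial_n(1, g_1, g_1 g_2, \dots, g_1 \cdots g_{n+1})\big)$$
$$= \varphi(g_1, g_1 g_2, \dots, g_1 \cdots g_{n+1}) + \sum_{i=1}^{n} (-1)^i \varphi(1, g_1, \dots, \widehat{g_1 \cdots g_i}, \dots, g_1 \cdots g_{n+1}) + (-1)^{n+1} \varphi(1, g_1, \dots, g_1 \cdots g_n)$$
$$= g_1\big(f(g_2, g_3, \dots, g_{n+1})\big) + \sum_{i=1}^{n} (-1)^i f(g_1, \dots, g_i g_{i+1}, \dots, g_{n+1}) + (-1)^{n+1} f(g_1, \dots, g_n).$$

Comparing this formula with the original formula defining $\delta_n$ in Chapter VII of Basic Algebra, we get a match. That is, we have obtained the complex in $\mathcal{C}_{\mathbb{Z}}$ defining the usual groups $H^n(G, M)$ by applying $\mathrm{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$ to the standard resolution of $\mathbb{Z}$ in $\mathcal{C}_{\mathbb{Z}G}$ and implementing the system of isomorphisms $\Phi_n$. In particular, we obtain a more conceptual proof than in Basic Algebra of the fact that the sequence

$$0 \longrightarrow C^0(G, M) \xrightarrow{\ \delta_0\ } C^1(G, M) \xrightarrow{\ \delta_1\ } C^2(G, M) \xrightarrow{\ \delta_2\ } \cdots$$

is a complex and that cohomology groups are therefore well defined.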
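The explicit coboundary formula just derived is easy to experiment with. The sketch below implements it for the cyclic group of order 2, written additively, acting on the integers with the nontrivial element acting by negation; the choice of group, module, and action, and the representation of a cochain as a Python function on tuples, are all assumptions made for illustration. The final helper tests whether a cochain vanishes identically, so that the complex property can be observed on sample cochains.

```python
from itertools import product

ORDER = 2            # G = Z/2, written additively (an assumed example group)
G = range(ORDER)


def act(g, m):
    """Illustrative action of G on M = Z: the nontrivial element acts by -1."""
    return -m if g % ORDER == 1 else m


def coboundary(f, n):
    """Return delta_n f for an n-cochain f, given as a function on n-tuples.

    Implements g_1 f(g_2,...,g_{n+1})
      + sum_{i=1}^{n} (-1)^i f(g_1,...,g_i g_{i+1},...,g_{n+1})
      + (-1)^{n+1} f(g_1,...,g_n).
    """
    def df(g):                       # g = (g_1, ..., g_{n+1}), stored 0-indexed
        total = act(g[0], f(g[1:]))
        for i in range(1, n + 1):
            merged = g[:i - 1] + ((g[i - 1] + g[i]) % ORDER,) + g[i + 1:]
            total += (-1) ** i * f(merged)
        return total + (-1) ** (n + 1) * f(g[:n])
    return df


def is_zero(f, n):
    """True if the n-cochain f vanishes on every n-tuple."""
    return all(f(g) == 0 for g in product(G, repeat=n))
```

For any function `f` on `n`-tuples, `is_zero(coboundary(coboundary(f, n), n + 1), n + 2)` should hold, in line with the statement that the displayed sequence is a complex.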
This completes the discussion of the first main point of the section as outlined in the overview at the beginning. Next, any exact sequence

$$\cdots \xrightarrow{\ \partial'_{n+1}\ } F'_{n+1} \xrightarrow{\ \partial'_n\ } F'_n \xrightarrow{\ \partial'_{n-1}\ } \cdots \xrightarrow{\ \partial'_0\ } F'_0 \xrightarrow{\ \varepsilon'\ } \mathbb{Z} \longrightarrow 0$$

in the category $\mathcal{C}_{\mathbb{Z}G}$ in which all $\mathbb{Z}G$ modules $F'_n$ for $n \geq 0$ are free $\mathbb{Z}G$ modules is called a free resolution of $\mathbb{Z}$ in the category $\mathcal{C}_{\mathbb{Z}G}$. The second main point of the section is that if we apply the functor $\mathrm{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$ to this sequence with $\mathbb{Z}$ dropped, then the consecutive quotients of kernels modulo images are canonically isomorphic to the cohomology groups $H^n(G, M)$ obtained above. Thus $H^n(G, M)$ can be computed from any free resolution of $\mathbb{Z}$, and we are not obliged to use the standard free resolution. This result is stated precisely as Theorem 3.31 below.

By way of preparation, let us establish a slightly more general setting and work with it for a moment. Let $\mathcal{C}_R$ be the category of all unital left $R$ modules, where $R$ is any ring with identity. According to circumstances, a complex $X$ in $\mathcal{C}_R$ might be written with decreasing indices as

$$X: \quad \cdots \xrightarrow{\ \partial_{n+1}\ } X_{n+1} \xrightarrow{\ \partial_n\ } X_n \xrightarrow{\ \partial_{n-1}\ } X_{n-1} \xrightarrow{\ \partial_{n-2}\ } \cdots$$

or with increasing indices as

$$X: \quad \cdots \xrightarrow{\ d_{n-2}\ } X_{n-1} \xrightarrow{\ d_{n-1}\ } X_n \xrightarrow{\ d_n\ } X_{n+1} \xrightarrow{\ d_{n+1}\ } \cdots$$

Mathematically these complexes amount to the same thing: if we rename each $X_n$ in the second complex as $X_{-n}$ and rename each $d_n$ as $\partial_{-n-1}$, then we obtain the first complex. However, it is convenient to allow both systems of indexing because of applications.

For the first complex, which has decreasing indices, we define the $n^{\text{th}}$ homology
of $X$, written $H_n(X)$, by

\[ H_n(X) = (\ker \partial_{n-1})/(\operatorname{image} \partial_n). \]

For the second complex, which has increasing indices, we define the $n^{\text{th}}$ cohomology
of $X$, written $H^n(X)$, by

\[ H^n(X) = (\ker d_n)/(\operatorname{image} d_{n-1}). \]

In both cases the integer $n$ is called the degree. In either case the homology
or cohomology is again a module in $\mathcal{C}_R$. The condition that $X$ be a complex is
equivalent to the condition that the image of each incoming map be contained in
the kernel of the corresponding outgoing map, and this is precisely the condition
that the homology or cohomology be meaningful. Exactness at a particular
module in one of the complexes is the statement that the image of the incoming
map equals the kernel of the outgoing map. Thus the homology or cohomology
of $X$ measures the extent to which the complex $X$ fails to be exact.
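
As a small illustration of the definition of $H^n(X)$, the following Python sketch treats the
special case in which $R$ is the field $\mathbb{Q}$, so that each $X_n$ is a finite-dimensional vector
space and $\dim H^n(X) = \dim X_n - \operatorname{rank} d_n - \operatorname{rank} d_{n-1}$. The function names
`rank`, `matmul`, and `cohomology_dims` and the matrix conventions are assumptions made
for this sketch only; they do not come from the text.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a rational matrix (given as a list of rows) by Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def cohomology_dims(dims, maps):
    """dims[n] = dim X_n, and maps[n] is the matrix of d_n : X_n -> X_{n+1},
    acting on column vectors. Modules outside the listed range are taken to be 0."""
    # Check the complex condition d_{n+1} d_n = 0.
    for n in range(len(maps) - 1):
        assert all(e == 0 for row in matmul(maps[n + 1], maps[n]) for e in row)
    ranks = [rank(m) for m in maps]
    result = []
    for n, dim in enumerate(dims):
        rank_out = ranks[n] if n < len(ranks) else 0   # rank of d_n
        rank_in = ranks[n - 1] if n >= 1 else 0        # rank of d_{n-1}
        result.append(dim - rank_out - rank_in)
    return result


# Q --d_0--> Q^2 --d_1--> Q with d_0(x) = (x, x) and d_1(y1, y2) = y1 - y2: exact everywhere.
print(cohomology_dims([1, 2, 1], [[[1], [1]], [[1, -1]]]))   # [0, 0, 0]
# Replacing d_0 by the zero map leaves cohomology in degrees 0 and 1.
print(cohomology_dims([1, 2, 1], [[[0], [0]], [[1, -1]]]))   # [1, 1, 0]
```

Over a general ring the cohomology modules are not determined by ranks alone, so the
sketch is meant only to make the phrase "fails to be exact" concrete in the simplest setting.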

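Because the nature of the indexing of a complex is not mathematically significant,
we will treat only the case of increasing indices for a while, and the
modules associated to our complexes will therefore be cohomology modules. A
cochain map$^2$ between two complexes $X$ and $Y$ in the same category $\mathcal{C}_R$ is a
system $f = \{f_n\}$ of $R$ homomorphisms $f_n : X_n \to Y_n$ such that the various
squares commute in Figure 3.1.

\[
\begin{array}{cccccccc}
X: & \cdots \xrightarrow{d_{n-2}} & X_{n-1} & \xrightarrow{d_{n-1}} & X_n & \xrightarrow{d_n} & X_{n+1} & \xrightarrow{d_{n+1}} \cdots \\
 & & \downarrow{\scriptstyle f_{n-1}} & & \downarrow{\scriptstyle f_n} & & \downarrow{\scriptstyle f_{n+1}} & \\
Y: & \cdots \xrightarrow{d'_{n-2}} & Y_{n-1} & \xrightarrow{d'_{n-1}} & Y_n & \xrightarrow{d'_n} & Y_{n+1} & \xrightarrow{d'_{n+1}} \cdots
\end{array}
\]

FIGURE 3.1. A cochain map $f : X \to Y$.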


Proposition 3.23. A cochain map $f : X \to Y$ as in Figure 3.1 induces an $R$
homomorphism on cohomology $H^n(X) \to H^n(Y)$ in each degree.


PROOF. Suppose that $x_n$ is in $\ker d_n$, i.e., that $d_n(x_n) = 0$. The commutativity
of the right square gives $d'_n(f_n(x_n)) = f_{n+1}(d_n(x_n)) = 0$, and hence $f_n(x_n)$ is
in $\ker d'_n$. Suppose that $x_n$ is in image $d_{n-1}$, i.e., that $x_n = d_{n-1}(x_{n-1})$ for some
$x_{n-1}$. The commutativity of the left square gives $f_n(x_n) = f_n d_{n-1}(x_{n-1}) =
d'_{n-1}(f_{n-1}(x_{n-1}))$, and hence $f_n(x_n)$ is in image $d'_{n-1}$. Then it follows that $f_n$,
restricted to $\ker d_n$, descends to the quotient $(\ker d_n)/(\operatorname{image} d_{n-1})$, yielding a map of
$H^n(X)$ into $H^n(Y)$.


Suppose in the situation of Figure 3.1 that $g = \{g_n\}$ is a second cochain map
of $X$ into $Y$. We say that $f$ is homotopic$^3$ to $g$, written $f \simeq g$, if there is a system
$h = \{h_n\}$ of maps $h_n : X_n \to Y_{n-1}$ in $\mathcal{C}_R$ such that $d'h + hd = f - g$, i.e., if
$d'_{n-1}h_n + h_{n+1}d_n = f_n - g_n$ for all $n$.


Proposition 3.24. In the situation of Figure 3.1 if $f = \{f_n\}$ and $g = \{g_n\}$ are
two cochain maps of $X$ into $Y$ and if $f$ and $g$ are homotopic, then $f$ and $g$ induce
identical maps $H^n(X) \to H^n(Y)$ in each degree.


PROOF. Suppose that $d_n(x_n) = 0$. Then $f_n(x_n) - g_n(x_n) = d'_{n-1}(h_n(x_n)) +
h_{n+1}(d_n(x_n)) = d'_{n-1}(h_n(x_n)) + 0$ shows that the images of $x_n$ under $f_n$ and $g_n$
in $Y_n$ differ by a member of image $d'_{n-1}$.


Now we bring free R modules into the discussion. 


$^2$The analogous kind of system in which the complexes have decreasing indices is called a chain
map.

$^3$An analogous definition is to be made in the case of two chain maps. If the maps of $X$ are
$\partial_n : X_{n+1} \to X_n$ and the maps of $Y$ are $\partial'_n : Y_{n+1} \to Y_n$, then we are to have $h_n : X_n \to Y_{n+1}$ with
$\partial'_n h_n + h_{n-1}\partial_{n-1} = f_n - g_n$.
Proposition 3.25. For the diagram

\[
\begin{array}{ccccc}
F & \xrightarrow{\partial} & M & \xrightarrow{\partial_1} & N \\
\downarrow{\scriptstyle \tilde f} & & \downarrow{\scriptstyle f} & & \downarrow{\scriptstyle f_1} \\
F' & \xrightarrow{\partial'} & M' & \xrightarrow{\partial'_1} & N'
\end{array}
\]

in $\mathcal{C}_R$, suppose that the top and bottom rows are exact at $M$ and $M'$, suppose
that the square on the right commutes, and suppose that $F$ is a free $R$ module.
Then there exists an $R$ homomorphism $\tilde f : F \to F'$ that makes the left square
commute.


PROOF. If $x$ is a free generator of $F$, then $0 = f_1\partial_1\partial(x) = \partial'_1(f\partial x)$. By
exactness at $M'$, $f\partial x$ lies in image$(\partial')$. Choose any $y \in F'$ with $\partial' y = f\partial x$, and
define $\tilde f(x)$ to be this $y$. Then $f\partial x = \partial'\tilde f x$, and the left square commutes at $x$.
The universal mapping property of free $R$ modules says that $\tilde f$ extends to an $R$
homomorphism of $F$ into $F'$, and the extension has $f\partial = \partial'\tilde f$, as required.


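Corollary 3.26. In the category $\mathcal{C}_{\mathbb{Z}G}$, if the rows of the diagram

\[
\begin{array}{cccccccccc}
\cdots \xrightarrow{\partial_{n+1}} & X_{n+1} & \xrightarrow{\partial_n} & X_n & \xrightarrow{\partial_{n-1}} \cdots \xrightarrow{\partial_0} & X_0 & \xrightarrow{\varepsilon} & \mathbb{Z} & \to & 0 \\
 & \downarrow{\scriptstyle f_{n+1}} & & \downarrow{\scriptstyle f_n} & & \downarrow{\scriptstyle f_0} & & \downarrow{\scriptstyle 1} & & \\
\cdots \xrightarrow{\partial'_{n+1}} & Y_{n+1} & \xrightarrow{\partial'_n} & Y_n & \xrightarrow{\partial'_{n-1}} \cdots \xrightarrow{\partial'_0} & Y_0 & \xrightarrow{\varepsilon'} & \mathbb{Z} & \to & 0
\end{array}
\]

are free resolutions and the vertical identity map $1 : \mathbb{Z} \to \mathbb{Z}$ is given, then the
remaining vertical maps,

\[ f_0 : X_0 \to Y_0, \quad \dots, \quad f_n : X_n \to Y_n, \quad f_{n+1} : X_{n+1} \to Y_{n+1}, \quad \dots, \]

can be constructed inductively from the right to make all the squares commute.


REMARK. The resulting system $f = \{f_n\}$ is called a chain map over the
identity map $1 : \mathbb{Z} \to \mathbb{Z}$.


PROOF. There is no harm in including a vertical 0 map at the right between
the two 0 modules. Certainly the square whose verticals are the identity map
$1 : \mathbb{Z} \to \mathbb{Z}$ and the 0 map commutes. Proposition 3.25 is to be applied first to this
square and the second square from the right (with vertical $f_0$ to be constructed and
vertical $1 : \mathbb{Z} \to \mathbb{Z}$ given) to construct $f_0$, then to the second and third squares
from the right to construct $f_1$, and so on, inductively.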
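Proposition 3.27. For the diagram

\[
\begin{array}{ccccc}
\tilde F & \xrightarrow{\partial} & F & \xrightarrow{\partial_1} & N \\
\downarrow{\scriptstyle \tilde f} & & \downarrow{\scriptstyle f} & & \downarrow{\scriptstyle f_1} \\
\tilde F & \xrightarrow{\partial} & F & \xrightarrow{\partial_1} & N
\end{array}
\]

in $\mathcal{C}_R$, suppose that the top and bottom rows are exact at $F$, that the left and right
squares commute, that $F$ and $\tilde F$ are free $R$ modules, and that $h_1 : N \to F$ exists
with $f_1 - \partial_1 h_1$ vanishing on image$(\partial_1)$. Then there exists $h : F \to \tilde F$ such that
$\partial h + h_1\partial_1 = f$, and this property implies that $f - \partial h$ vanishes on image$(\partial)$.


PROOF. If $x$ is a free generator of $F$, then $f(x) - h_1(\partial_1(x))$ is in $\ker(\partial_1)$ because
$\partial_1(fx - h_1\partial_1 x) = f_1\partial_1 x - \partial_1 h_1\partial_1 x = (f_1 - \partial_1 h_1)(\partial_1 x)$ and because $f_1 - \partial_1 h_1$
vanishes on image$(\partial_1)$ by assumption. Therefore $f(x) - h_1(\partial_1(x))$ is in image$(\partial)$,
and we can write $f(x) - h_1(\partial_1(x)) = \partial a$ for some $a \in \tilde F$. Put $h(x) = a$. Then
$\partial hx = \partial a = fx - h_1\partial_1 x$, and $h$ has the required property on the generator $x$.
The universal mapping property of the free $R$ module $F$ allows us to extend $h$ to
an $R$ homomorphism $h : F \to \tilde F$, and the extension satisfies $\partial h = f - h_1\partial_1$.
Once $h$ has this property, then necessarily $(f - \partial h)\partial = (h_1\partial_1)\partial = h_1(\partial_1\partial) = 0$.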


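Corollary 3.28. In the category $\mathcal{C}_{\mathbb{Z}G}$, if a free resolution $X = \{X_n\}$ of $\mathbb{Z}$ and a
chain map $f = \{f_n\}$ of $X$ with itself are given such that the map from $\mathbb{Z}$ to itself
is 0, then the chain map $f$ is homotopic to the zero chain map $g = \{g_n\}$ with
$g_n = 0$ for all $n$.


PROOF. We are given the diagram

\[
\begin{array}{cccccccccc}
\cdots \to & X_n & \to \cdots \xrightarrow{\partial_1} & X_1 & \xrightarrow{\partial_0} & X_0 & \xrightarrow{\varepsilon} & \mathbb{Z} & \to & 0 \\
 & \downarrow{\scriptstyle f_n} & & \downarrow{\scriptstyle f_1} & & \downarrow{\scriptstyle f_0} & & \downarrow{\scriptstyle 0} & & \\
\cdots \to & X_n & \to \cdots \xrightarrow{\partial_1} & X_1 & \xrightarrow{\partial_0} & X_0 & \xrightarrow{\varepsilon} & \mathbb{Z} & \to & 0
\end{array}
\]

in the category $\mathcal{C}_{\mathbb{Z}G}$ with the two rows as free resolutions and all squares commuting.
We are to construct maps $h_n : X_n \to X_{n+1}$ with $\partial_n h_n + h_{n-1}\partial_{n-1} = f_n$.
Let $h_{-2}$ be the 0 map from the top 0 module to the bottom $\mathbb{Z}$, and let $h_{-1}$ be the 0
map from the top $\mathbb{Z}$ to the bottom $X_0$. Then $\partial_n h_n + h_{n-1}\partial_{n-1} = f_n$ is satisfied
for $n = -1$ because the map $f_{-1}$ is the 0 map from $\mathbb{Z}$ to itself. Proposition 3.27
then allows us to construct inductively first $h_0$, then $h_1$, then $h_2$, and so on.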
Corollary 3.29. In the category $\mathcal{C}_{\mathbb{Z}G}$, if a free resolution $X = \{X_n\}$ of $\mathbb{Z}$ and a
chain map $f = \{f_n\}$ of $X$ with itself are given such that the map from $\mathbb{Z}$ to itself
is the identity 1, then the chain map $f$ is homotopic to the identity chain map
$g = \{g_n\}$ with $g_n = 1$ for all $n$.


PROOF. Apply Corollary 3.28 to $f - 1$.


Corollary 3.30. In the category $\mathcal{C}_{\mathbb{Z}G}$, if two free resolutions $X = \{X_n\}$ of $\mathbb{Z}$
and $Y = \{Y_n\}$ of $\mathbb{Z}$ are given and if two chain maps $f : X \to Y$ and $g : Y \to X$
are given such that the map from $\mathbb{Z}$ to itself in each case is the identity 1, then $gf$
is homotopic to 1 and $fg$ is homotopic to 1.


PROOF. Apply Corollary 3.29 to $fg$ and then to $gf$.


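Theorem 3.31. If

\[ \cdots \xrightarrow{\partial'_{n+1}} F'_{n+1} \xrightarrow{\partial'_n} F'_n \xrightarrow{\partial'_{n-1}} \cdots \xrightarrow{\partial'_0} F'_0 \xrightarrow{\varepsilon'} \mathbb{Z} \to 0 \]

is any free resolution of $\mathbb{Z}$ in the category $\mathcal{C}_{\mathbb{Z}G}$ and $M$ is a unital left $\mathbb{Z}G$ module,
then $H^n(G, M)$ is canonically isomorphic to the $n^{\text{th}}$ cohomology group of the
complex in $\mathcal{C}_{\mathbb{Z}}$ given by

\[ 0 \to \operatorname{Hom}_{\mathbb{Z}G}(F'_0, M) \to \cdots \to \operatorname{Hom}_{\mathbb{Z}G}(F'_n, M) \xrightarrow{d_n} \operatorname{Hom}_{\mathbb{Z}G}(F'_{n+1}, M) \to \cdots \]

with $d_n = \operatorname{Hom}(\partial'_n, 1)$ for $n \geq 0$.

PROOF. Let the resolution in the statement of the theorem be $Y$, and let $X$ be the
standard free resolution of $\mathbb{Z}$ in the category $\mathcal{C}_{\mathbb{Z}G}$. Two applications of Corollary
3.26 produce chain maps $f : X \to Y$ and $g : Y \to X$ over $1 : \mathbb{Z} \to \mathbb{Z}$. Corollary
3.30 shows that $gf$ is homotopic to $1 = 1_X$ and $fg$ is homotopic to $1 = 1_Y$.
Apply the functor $\operatorname{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$ throughout, including to the members of the
homotopies. Then we obtain chain maps

\[
\begin{aligned}
&\operatorname{Hom}_{\mathbb{Z}G}(f, 1) : \operatorname{Hom}_{\mathbb{Z}G}(Y, M) \to \operatorname{Hom}_{\mathbb{Z}G}(X, M) \\
\text{and}\quad &\operatorname{Hom}_{\mathbb{Z}G}(g, 1) : \operatorname{Hom}_{\mathbb{Z}G}(X, M) \to \operatorname{Hom}_{\mathbb{Z}G}(Y, M)
\end{aligned}
\]

with

\[
\begin{aligned}
&\operatorname{Hom}_{\mathbb{Z}G}(f, 1) \circ \operatorname{Hom}_{\mathbb{Z}G}(g, 1) \text{ homotopic to } 1 \\
\text{and}\quad &\operatorname{Hom}_{\mathbb{Z}G}(g, 1) \circ \operatorname{Hom}_{\mathbb{Z}G}(f, 1) \text{ homotopic to } 1.
\end{aligned}
\]

Proposition 3.24 allows us to conclude that
$\operatorname{Hom}_{\mathbb{Z}G}(f, 1) \circ \operatorname{Hom}_{\mathbb{Z}G}(g, 1)$ induces the identity on $H^*(\operatorname{Hom}_{\mathbb{Z}G}(X, M))$
and
$\operatorname{Hom}_{\mathbb{Z}G}(g, 1) \circ \operatorname{Hom}_{\mathbb{Z}G}(f, 1)$ induces the identity on $H^*(\operatorname{Hom}_{\mathbb{Z}G}(Y, M))$.


Thus $\operatorname{Hom}_{\mathbb{Z}G}(g, 1)$ induces an isomorphism of each group $H^n(\operatorname{Hom}_{\mathbb{Z}G}(X, M))$
onto $H^n(\operatorname{Hom}_{\mathbb{Z}G}(Y, M))$.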
6. Relative Brauer Group when the Galois Group Is Cyclic


This section has two parts to it. The first part specializes Theorem 3.31 to compute
group cohomology when the group in question is cyclic of finite order. The second
part applies this computation to $H^2(\operatorname{Gal}(K/F), K^\times)$ and obtains information
about Brauer groups. As a consequence we obtain new information about the
classification of noncommutative division algebras.

Let $G$ be a finite cyclic group of order $n$. Theorem 3.31 says that if $G$ acts by
automorphisms on an abelian group $M$, then $H^2(G, M)$ can be computed from
any free resolution of $\mathbb{Z}$ in the category $\mathcal{C}_{\mathbb{Z}G}$. The standard resolution of $\mathbb{Z}$ is one
such resolution. We shall construct another such resolution that is special to the
case of $G$ cyclic and that makes the cohomology more transparent.

Let $G = \{1, s, s^2, \dots, s^{n-1}\}$. Lemma 3.19 notes that the free abelian group
on the 1-tuples $(1), (s), (s^2), \dots, (s^{n-1})$ is a free $\mathbb{Z}G$ module with $\mathbb{Z}G$ basis $(1)$.
In other words, the elements of the left $\mathbb{Z}G$ module $\mathbb{Z}G$ may be identified with
the integer linear combinations of these 1-tuples. Define two operators $T$ and $N$
from the left $\mathbb{Z}G$ module $\mathbb{Z}G$ into itself by

\[
\begin{aligned}
T &= \text{multiplication by } (s) - (1), \\
N &= \text{multiplication by } (1) + (s) + \cdots + (s^{n-1}).
\end{aligned}
\]

Each of these respects addition and commutes with multiplication by $(s)$, hence
is a $\mathbb{Z}G$ module homomorphism. We shall compute the kernel and image of each.

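The kernel of $T$ consists of all elements for which left multiplication by $(s)$
fixes the element. The elements of $\mathbb{Z}G$ are of the form $\sum_{j=0}^{n-1} c_j(s^j)$, and $(s)$ times
this gives $c_{n-1}(1) + \sum_{j=1}^{n-1} c_{j-1}(s^j)$. Since $(1), (s), \dots, (s^{n-1})$ form a $\mathbb{Z}$ basis,
the condition to be in the kernel of $T$ is that $c_{n-1} = c_0 = c_1 = \cdots = c_{n-2}$. Thus

\[ \ker T = \{ c((1) + (s) + \cdots + (s^{n-1})) \mid c \in \mathbb{Z} \}. \]

Also,

\[
\begin{aligned}
\operatorname{image} T &= \{\text{integer polynomials in } (s) \text{ divisible by } (s) - (1)\} \\
&= \{\text{integer polynomials equal to 0 when } s \text{ is set equal to 1}\} \\
&= \Big\{ \sum_{j} c_j(s^j) \;\Big|\; \sum_{j} c_j = 0 \Big\}.
\end{aligned}
\]

In the case of the operator $N$, we have $N(s^j) = (1) + (s) + \cdots + (s^{n-1})$, and
therefore $N\big(\sum_j c_j(s^j)\big) = \sum_j c_j((1) + (s) + \cdots + (s^{n-1}))$. Hence

\[
\begin{aligned}
\ker N &= \Big\{ \sum_{j=0}^{n-1} c_j(s^j) \;\Big|\; \sum_{j=0}^{n-1} c_j = 0 \Big\} = \operatorname{image} T, \\
\operatorname{image} N &= \{ c((1) + (s) + \cdots + (s^{n-1})) \mid c \in \mathbb{Z} \} = \ker T.
\end{aligned}
\]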
An immediate consequence of this and a supplementary argument concerning the
augmentation map is the following proposition.


Proposition 3.32. If $G$ is a finite cyclic group, then the sequence

\[ \cdots \xrightarrow{T} \mathbb{Z}G \xrightarrow{N} \mathbb{Z}G \xrightarrow{T} \mathbb{Z}G \xrightarrow{N} \mathbb{Z}G \xrightarrow{T} \mathbb{Z}G \xrightarrow{\varepsilon} \mathbb{Z} \to 0 \]

is a free resolution of $\mathbb{Z}$ in the category $\mathcal{C}_{\mathbb{Z}G}$.


PROOF. We still need to check exactness at the first $\mathbb{Z}G$ from the right. The
map $\varepsilon$ is the $\mathbb{Z}G$ homomorphism with $\varepsilon((1)) = 1$. Hence $\varepsilon((s^j)) = 1$ for all $j$,
and $\varepsilon\big(\sum_j c_j(s^j)\big) = \sum_j c_j$. Thus $\ker \varepsilon = \ker N = \operatorname{image} T$, and exactness
is proved.
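
The kernel and image computations for $T$ and $N$ can be checked numerically. The
following Python sketch represents an element $\sum_j c_j(s^j)$ of $\mathbb{Z}G$ by its coefficient
list $[c_0, \dots, c_{n-1}]$. The names `shift`, `T`, `N`, `eps`, and `preimage_under_T` are
illustrative assumptions; `preimage_under_T` makes explicit one way to solve
$T(y) = x$ when the coefficients of $x$ sum to 0, which is the inclusion $\ker N \subseteq \operatorname{image} T$.

```python
def shift(x):
    """Multiply x = sum c_j (s^j) by (s): the coefficient c_{j-1} moves to position j."""
    return [x[-1]] + x[:-1]


def T(x):
    """Multiplication by (s) - (1)."""
    return [a - b for a, b in zip(shift(x), x)]


def N(x):
    """Multiplication by (1) + (s) + ... + (s^{n-1}): every coefficient becomes sum(x)."""
    total = sum(x)
    return [total] * len(x)


def eps(x):
    """Augmentation map: the sum of the coefficients."""
    return sum(x)


def preimage_under_T(x):
    """For x with eps(x) = 0, return y with T(y) = x, using y_j = -(x_0 + ... + x_j)."""
    assert eps(x) == 0
    y, partial = [], 0
    for c in x:
        partial += c
        y.append(-partial)
    return y


x = [2, 7, 1, 8, 2]
print(T(N(x)), N(T(x)))            # both are the zero element: TN = NT = 0
z = [3, -1, 4, -6]                 # coefficient sum 0, so z lies in ker N = ker eps
print(T(preimage_under_T(z)) == z) # True: z lies in image T
```

The check is not a proof, but it mirrors the argument: $N$ depends only on the coefficient
sum, and an element with coefficient sum 0 has an explicit preimage under $T$.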


Corollary 3.33. If $G$ is a finite cyclic group and $M$ is an abelian group on
which $G$ acts by automorphisms, then

\[ H^2(G, M) \cong M^G/((1) + (s) + \cdots + (s^{n-1}))M, \]

where $M^G$ is the subgroup of all elements of $M$ fixed by $G$.


PROOF. Let us number the terms $\mathbb{Z}G$ in the resolution of Proposition 3.32
starting with index 0 from the right. Combining Proposition 3.32 with Theorem
3.31, we see that we may compute $H^2(G, M)$ as the cohomology of the complex
obtained by applying the functor $\operatorname{Hom}_{\mathbb{Z}G}(\,\cdot\,, M)$ to the terms with indices 1, 2, 3
in the resolution in Proposition 3.32. Thus $H^2(G, M)$ is the cohomology at the
middle of the complex

\[ \operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, M) \xrightarrow{(\,\cdot\,)\circ N} \operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, M) \xrightarrow{(\,\cdot\,)\circ T} \operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, M). \]

The mapping $\alpha \mapsto \alpha((1))$ of $\operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, M)$ into $M$ is one-one and onto, and
we can identify members $\alpha$ of $\operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, M)$ with the corresponding elements
$\alpha((1))$ accordingly. If $\alpha$ is in $\ker((\,\cdot\,)\circ T)$, then $\alpha(T((1))) = 0$, and we thus have
$\alpha((s)) = \alpha((1))$ and $(s)\alpha((1)) = \alpha((1))$. Hence $\alpha((1))$ is in $M^G$. These steps
can be reversed, and thus $\ker((\,\cdot\,)\circ T) = M^G$. If $\beta$ is in image$((\,\cdot\,)\circ N)$, then
$\beta = \alpha \circ N$ for some $\alpha \in \operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, M)$, and thus

\[ \beta((1)) = \alpha((1) + (s) + \cdots + (s^{n-1})) = \alpha((1)) + (s)\alpha((1)) + \cdots + (s^{n-1})\alpha((1)). \]

Since $\alpha((1))$ is a completely arbitrary element of $M$, we see that image$((\,\cdot\,)\circ N) =
((1) + (s) + \cdots + (s^{n-1}))M$, and the result follows.
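
For a finite module the quotient in Corollary 3.33 can be enumerated directly. The
sketch below assumes, purely for illustration, that $M = \mathbb{Z}/m\mathbb{Z}$ and that the generator $s$
acts as multiplication by an integer $u$ with $u^n \equiv 1 \bmod m$; the function name `h2_cyclic`
and its return convention are hypothetical. It returns the orders of $M^G$, of
$((1) + (s) + \cdots + (s^{n-1}))M$, and of their quotient.

```python
def h2_cyclic(m, u, n):
    """Orders (|M^G|, |N M|, |M^G / N M|) for M = Z/mZ with s acting as x -> u*x,
    where G is cyclic of order n. Requires u^n = 1 mod m so that the action is defined."""
    assert pow(u, n, m) == 1 % m
    # M^G: elements fixed by the generator s, hence by all of G.
    fixed = {x for x in range(m) if (u * x - x) % m == 0}
    # The element (1) + (s) + ... + (s^{n-1}) acts as multiplication by 1 + u + ... + u^{n-1}.
    norm = sum(pow(u, k, m) for k in range(n)) % m
    image = {(norm * x) % m for x in range(m)}
    assert image <= fixed            # the norm image always lands in the fixed elements
    return len(fixed), len(image), len(fixed) // len(image)


print(h2_cyclic(8, 1, 2))   # (8, 4, 2): trivial action, quotient of order 2
print(h2_cyclic(8, 3, 2))   # (2, 2, 1): the quotient is trivial
print(h2_cyclic(8, 7, 2))   # (2, 1, 2): s acts as -1, and the norm element acts as 0
```

Because the image of the norm element is a subgroup of $M^G$, the order of the quotient is
the ratio of the two orders; the sketch reports only orders, not the group structure.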
Now we specialize to the Galois case that has occupied our attention in this
chapter. Let $K/F$ be a finite Galois extension of fields. We are going to set $G =
\operatorname{Gal}(K/F)$, $n = \dim_F K$, and $M = K^\times$. To take advantage of Corollary 3.33, we
suppose that $\operatorname{Gal}(K/F)$ is cyclic. Then $M^G = (K^\times)^G = F^\times$. If $x$ is an element
of $K^\times$, then the orbit $Gx$ is $\{x, sx, s^2x, \dots, s^{n-1}x\}$. Remembering that we are
using additive notation in working with cohomology of groups and multiplicative
notation in working with $K^\times$, we see that the element $(1) + (s) + \cdots + (s^{n-1})$
of $\mathbb{Z}G$ is to be regarded as operating by giving the product of the members of an
orbit in $K^\times$. This product for the orbit of $x \in K^\times$ is $N_{K/F}(x)$, and Corollary
3.33 thus specializes to the following result.


Corollary 3.34. If $K/F$ is a finite Galois extension of fields such that
$\operatorname{Gal}(K/F)$ is cyclic, then

\[ H^2(\operatorname{Gal}(K/F), K^\times) \cong F^\times/N_{K/F}(K^\times). \]


Corollary 3.34 considerably simplifies the proofs of Frobenius's Theorem
about division algebras over the reals (Theorem 2.50) and Wedderburn's Theorem
about finite division rings (Theorem 2.48), and thus the theory in Chapter III has
added something to the theory of Chapter II even in these very special situations.
In the case of the Frobenius theorem, the only nontrivial algebraic extension of
$\mathbb{R}$ is $\mathbb{C}$, and thus Theorem 3.14 and Corollary 3.34 give

\[
\begin{aligned}
\mathcal{B}(\mathbb{R}) = \mathcal{B}(\mathbb{C}/\mathbb{R}) &\cong H^2(\operatorname{Gal}(\mathbb{C}/\mathbb{R}), \mathbb{C}^\times) \\
&\cong \mathbb{R}^\times/N_{\mathbb{C}/\mathbb{R}}(\mathbb{C}^\times) = \mathbb{R}^\times/(\mathbb{R}^\times)^+ \cong \mathbb{Z}/2\mathbb{Z}.
\end{aligned}
\]

Hence the reals and the quaternions are the only finite-dimensional central simple
division algebras over $\mathbb{R}$.

In the case of the Wedderburn theorem, suppose that a finite field $K$ splits a
central division algebra over a field $F$ with $q$ elements. Say that $|K| = q^n$. For
finite fields the Galois groups are always cyclic, and thus $\operatorname{Gal}(K/F)$ is cyclic of
order $n$, generated by the map $x \mapsto x^q$. In view of Corollary 3.34, the Wedderburn
theorem follows if $F^\times/N_{K/F}(K^\times)$ is shown to be trivial, i.e., if the norm map
$N_{K/F} : K^\times \to F^\times$ is onto. The group $K^\times$ is cyclic, say with a generator $x_0$ of
order $q^n - 1$. Since the norm of an element is the product of the images under
the Galois group, the norm of $x_0$ is given by

\[ N_{K/F}(x_0) = x_0 x_0^{q} x_0^{q^2} \cdots x_0^{q^{n-1}} = x_0^{1 + q + q^2 + \cdots + q^{n-1}} = x_0^{(q^n - 1)/(q - 1)}. \]

This has order $q - 1$, not less, and thus is a generator of $F^\times$. Thus the norm map
is onto $F^\times$.
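
The surjectivity of the norm can be seen concretely in the smallest nontrivial case. The
Python sketch below assumes $F$ is the field with $q = 3$ elements and models $K$, the field
with 9 elements, as pairs $(a, b)$ standing for $a + bi$ with $i^2 = -1$ (valid because $x^2 + 1$
is irreducible modulo 3). This model and the names `mul`, `power`, and `norm` are
choices made for the illustration only. With $n = 2$ the norm is $x \mapsto x \cdot x^q = x^{1+q}$.

```python
P = 3  # F has q = 3 elements; K has 9 elements, written as a + b*i with i^2 = -1


def mul(x, y):
    """Product of a + b*i and c + d*i with coefficients reduced modulo P."""
    (a, b), (c, d) = x, y
    return ((a * c - b * d) % P, (a * d + b * c) % P)


def power(x, e):
    """x raised to the nonnegative integer power e."""
    result = (1, 0)
    for _ in range(e):
        result = mul(result, x)
    return result


def norm(x):
    """Product of the Galois conjugates x and x^q, i.e. x^(1 + q)."""
    return mul(x, power(x, P))


units = [(a, b) for a in range(P) for b in range(P) if (a, b) != (0, 0)]
norms = {norm(x) for x in units}
print(sorted(norms))          # [(1, 0), (2, 0)]: every nonzero element of F is a norm

x0 = (1, 1)                   # a generator of the cyclic group of order 8
print(power(x0, 4), power(x0, 8), norm(x0))   # (2, 0) (1, 0) (2, 0)
```

Here $(q^n - 1)/(q - 1) = 4$, and the norm of the generator $x_0 = 1 + i$ equals $x_0^4 = -1$, which has
order $q - 1 = 2$ and so generates $F^\times$, in line with the general argument.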
For a more difficult example that we can settle completely, consider the case
that $F = \mathbb{Q}$ and $K = \mathbb{Q}(\sqrt{m})$ for a square-free integer $m$ other than 1. The
Galois group in this case is a 2-element group and is in particular cyclic. Thus
Corollary 3.34 applies. The norm of the member $x + y\sqrt{m}$ of $K$, where $x$
and $y$ are in $\mathbb{Q}$, is $x^2 - my^2$. The problem of determining the quotient group
$F^\times/N_{K/\mathbb{Q}}(K^\times)$ may be rephrased in terms of genera as in Section I.5. Specifically
the field discriminant $D$ is defined to be $m$ if $m \equiv 1 \bmod 4$ and to be $4m$ if
$m \not\equiv 1 \bmod 4$. A genus for $\mathbb{Q}(\sqrt{m})$ is an equivalence class of primitive quadratic
forms $ax^2 + bxy + cy^2$ whose discriminant matches the field discriminant $D$,
except that the theory of Chapter I discards all negative definite forms. Equivalence
is determined by the action of $\mathrm{SL}(2, \mathbb{Q})$. Lemma 1.13 shows for $D > 0$
that each nonzero rational number is a value taken on by the members of one and
only one genus at points $(x, y) \neq (0, 0)$ with $x$ and $y$ both rational; for $D < 0$,
Lemma 1.13 applies to positive definite forms and positive rational numbers. Let
us now enlarge the definition of genera to include negative definite forms and
negative rational numbers when $D < 0$.

The definition of the multiplication of classes of forms is set up so as to
be compatible with multiplication of the values of the quadratic forms, and the
genera define a group, the identity element being the principal genus. Since
a representative of the principal genus is $x^2 - my^2$, the nonzero rational values
corresponding to the principal genus are exactly the members of the group
$N_{K/\mathbb{Q}}(K^\times)$. Consequently the quotient group $F^\times/N_{K/\mathbb{Q}}(K^\times)$ is isomorphic to
the group of genera.$^4$ The easy result concerning the group of genera is Theorem
1.14, which says that this group is finite abelian and that every nontrivial element
has order 2; since $\mathcal{B}(K/F) \cong F^\times/N_{K/\mathbb{Q}}(K^\times)$, Corollary 3.15 gives another way
of seeing that every nontrivial element has order 2. The hard result, which appears
in Problems 25-29 at the end of Chapter I, identifies the order of the group of
genera explicitly. If $D > 0$, then the order of the group of genera is $2^{g'}$, where
$g' + 1$ is the number of distinct prime divisors of $D$; if $D < 0$, then the order of
the group of genera is $2^{g'+1}$.

Consequently if $m$ has $g + 1$ distinct prime divisors, then the relative Brauer
group is a product of 2-element groups whose order is given by

\[
|\mathcal{B}(\mathbb{Q}(\sqrt{m})/\mathbb{Q})| =
\begin{cases}
2^{g} & \text{if } m > 0 \text{ and } m \not\equiv 3 \bmod 4, \\
2^{g+1} & \text{if } m > 0 \text{ and } m \equiv 3 \bmod 4, \\
2^{g+1} & \text{if } m < 0 \text{ and } m \not\equiv 3 \bmod 4, \\
2^{g+2} & \text{if } m < 0 \text{ and } m \equiv 3 \bmod 4.
\end{cases}
\]


$^4$With the understanding that genera from negative definite forms are to be allowed if $D < 0$.
In quoting this result, we are now making allowances for genera corresponding to negative
definite forms.
The example with $K/\mathbb{Q}$ quadratic shows the kind of information that has to go
into a complete determination of the relative Brauer group when $K/\mathbb{Q}$ is Galois.
Showing that a relative Brauer group is nontrivial in a case with $\operatorname{Gal}(K/\mathbb{Q})$ cyclic
is considerably easier. According to Corollary 3.34, all one needs to know is that
the norm function does not carry $K^\times$ onto $\mathbb{Q}^\times$, and congruence conditions can
be used as a first step in addressing this question; Problem 4 at the end of the
chapter illustrates this principle. Problems 15-17 at the end of Chapter II give
a construction in this situation of nontrivial central simple algebras over $\mathbb{Q}$ that
are split by $K$, and such algebras whose dimension is the square of a prime are
necessarily division algebras. Problems 6-12 at the end of the present chapter
give a sufficient condition for obtaining a division algebra when the dimension is
not the square of a prime.


7. Problems 


1. Let $A$ be a finite-dimensional central simple algebra over a field $F$, let $K$ be a
subfield of $A$, and let $B$ be the centralizer of $K$ in $A$.
(a) Arguing as in the proof of Theorem 3.3, exhibit a one-one algebra homomorphism
$A \otimes_F K \to \operatorname{End}_B A$.
(b) Referring to the proof of Theorem 2.2 and counting dimensions with the aid
of the Double Centralizer Theorem, prove that the mapping in (a) is onto
$\operatorname{End}_B A$.
(c) Deduce that $A \otimes_F K$ and $B$ yield the same member of $\mathcal{B}(K)$.


2. Let $a = a(\sigma, \tau)$ be a 2-cocycle in $Z^2(\operatorname{Gal}(K/F), K^\times)$, where $K/F$ is a finite
Galois extension of fields. Prove for each $\tau$ that $\prod_{\sigma \in \operatorname{Gal}(K/F)} a(\sigma, \tau)$ lies in $F^\times$.


3. Let $K/F$ be a finite Galois extension of fields with $\operatorname{Gal}(K/F)$ cyclic. Corollary
3.34 identifies $H^q(\operatorname{Gal}(K/F), K^\times)$ for $q = 2$. Identify this group for all other
values of $q > 0$.


Problems 4-5 amplify the discussion of cyclic algebras that was begun in Problems
17-19 at the end of Chapter II. Problem 4 in effect produces an explicit division
algebra of dimension 9 over $\mathbb{Q}$, and Problem 5 hints at the existence of an explicit
division algebra of dimension $n^2$ over $\mathbb{Q}$ for each integer $n > 1$.


4. Let $\zeta = e^{2\pi i/7}$ and let $K = \mathbb{Q}(\zeta) \cap \mathbb{R}$.
(a) Show that $K/\mathbb{Q}$ is a Galois extension of degree 3, that a basis for $K$ over
$\mathbb{Q}$ consists of $\tau_1 = \zeta + \zeta^{-1}$, $\tau_2 = \zeta^2 + \zeta^{-2}$, $\tau_3 = \zeta^3 + \zeta^{-3}$, and that the
Galois group permutes $\tau_1, \tau_2, \tau_3$ cyclically.
(b) Show that if $a, b, c$ are in $\mathbb{Q}$, then
\[
\begin{aligned}
N_{K/\mathbb{Q}}(a\tau_1 + b\tau_2 + c\tau_3) &= abc(\tau_1^3 + \tau_2^3 + \tau_3^3) \\
&\quad + (a^3 + b^3 + c^3 + 3abc)\tau_1\tau_2\tau_3 \\
&\quad + (a^2b + ac^2 + b^2c)(\tau_1^2\tau_2 + \tau_2^2\tau_3 + \tau_3^2\tau_1) \\
&\quad + (a^2c + ab^2 + bc^2)(\tau_1^2\tau_3 + \tau_2^2\tau_1 + \tau_3^2\tau_2).
\end{aligned}
\]
(c) Verify the following identities:
\[
\begin{gathered}
\tau_1 + \tau_2 + \tau_3 = -1, \\
\tau_1\tau_2 = \tau_1 + \tau_3, \quad \tau_2\tau_3 = \tau_1 + \tau_2, \quad \tau_3\tau_1 = \tau_2 + \tau_3, \\
\tau_1^2 = \tau_2 + 2, \quad \tau_2^2 = \tau_3 + 2, \quad \tau_3^2 = \tau_1 + 2.
\end{gathered}
\]
(d) Combine (b) and (c) to show that
\[
N_{K/\mathbb{Q}}(a\tau_1 + b\tau_2 + c\tau_3) = (a^3 + b^3 + c^3) - abc + 3(a^2b + ac^2 + b^2c) - 4(a^2c + ab^2 + bc^2).
\]
(e) Under the assumption that $a, b, c$ are integers with $\operatorname{GCD}(a, b, c) = 1$, show
that $N_{K/\mathbb{Q}}(a\tau_1 + b\tau_2 + c\tau_3) \not\equiv 0 \bmod 3$.
(f) Deduce from (e) that $r = 3$ is not in $N_{K/\mathbb{Q}}(K^\times)$. (Educational note: Consequently
Problems 18-19 at the end of Chapter II produce an explicit division
algebra over $\mathbb{Q}$ of dimension 9.)

5. (a) Show for each integer $n > 1$ that there exists a prime $p$ such that $n$ divides
$p - 1$.
(b) Deduce for this $p$ that there exists a field $L$ with $\mathbb{Q} \subseteq L \subseteq \mathbb{Q}(e^{2\pi i/p})$ such
that the field extension $L/\mathbb{Q}$ is a Galois extension whose Galois group is
cyclic of order $n$.


Problems 6-12 continue the discussion of cyclic algebras that was begun in Problems
17-19 at the end of Chapter II and continued in Problems 4-5 above. Let $F$ be any
field, and let $K$ be a finite Galois extension of $F$ whose Galois group $G = \operatorname{Gal}(K/F)$
is cyclic of order $n$. Let $\sigma$ be a generator of $G$, fix an element $r \neq 0$ in $F$, and let $A$
be the subset of matrices in $M_n(K)$ of the form

\[
\begin{pmatrix}
c_1 & c_2 & c_3 & \cdots & c_n \\
r\sigma(c_n) & \sigma(c_1) & \sigma(c_2) & \cdots & \sigma(c_{n-1}) \\
r\sigma^2(c_{n-1}) & r\sigma^2(c_n) & \sigma^2(c_1) & \cdots & \sigma^2(c_{n-2}) \\
\vdots & & & \ddots & \vdots \\
r\sigma^{n-1}(c_2) & r\sigma^{n-1}(c_3) & r\sigma^{n-1}(c_4) & \cdots & \sigma^{n-1}(c_1)
\end{pmatrix}.
\]

Identify $c \in K$ with the diagonal member of $A$ for which $c_1 = c$ and $c_2 = \cdots = c_n = 0$,
and let $j$ be the member of $A$ for which $c_1 = 0$, $c_2 = 1$, and $c_3 = \cdots = c_n = 0$.
Under this identification every member of $A$ has a unique expansion as $\sum_{k=1}^{n} c_k j^{k-1}$
with all $c_k$ in $K$, and the element $j$ satisfies $j^n = r$ and $jcj^{-1} = \sigma(c)$ for $c \in K$.
Take it as known that $A$ is a central simple algebra over $F$ of dimension $n^2$. This
series of problems leads in part to another theorem due to Wedderburn. (However, a
more direct proof of the theorem of Wedderburn without the other results is possible.)

6. In the construction of factor sets in Section 2, use $x_{\sigma^k} = j^k$ for $0 \leq k \leq n - 1$.
Show that the algebra $A$ above corresponds to the 2-cocycle $a$ with

\[
a(\sigma^k, \sigma^l) =
\begin{cases}
1 & \text{if } k + l < n, \\
r & \text{if } k + l \geq n.
\end{cases}
\]

7. Under the assumption that $r = N_{K/F}(x)$ with $x \in K^\times$, show that the choice
$c_{\sigma^k} = x\sigma(x)\sigma^2(x)\cdots\sigma^{k-1}(x)$ exhibits the factor set of the previous problem as
a trivial factor set and hence shows that $A \cong M_n(F)$.


8. Let $F = \{F_k\}$ be the standard free resolution of $\mathbb{Z}$ in $\mathcal{C}_{\mathbb{Z}G}$, and let $X = \{X_k\}$
be the free resolution of Proposition 3.32. The latter has $X_k = \mathbb{Z}G$ for every
$k \geq 0$. Trace through the proof of Corollary 3.26, and show that the proof allows
a chain map $f = \{f_k\}$ to be defined in such a way that the values of $f_0, f_1, f_2$
on standard $\mathbb{Z}G$ basis elements of $F_0, F_1, F_2$ are $f_0(1) = 1$, $f_1(1, \sigma^k) =
-(1 + \sigma + \cdots + \sigma^{k-1})$ for $0 \leq k < n$, and

fll kg!) 0 ifO<k<l<n, 

,0 ,0)= 

: ol if0<l <k<n. 
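9. Let $\Phi_k : \operatorname{Hom}_{\mathbb{Z}G}(F_k, K^\times) \to C^k(G, K^\times)$ be the isomorphism of Section 5,
and let $\psi$ be in $\operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, K^\times)$. Show that the member of $C^2(G, K^\times)$ that
corresponds to $\psi$ is $\Phi_2(\psi \circ f_2)$ and that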


v0) ifk +l <n, 


k ly _ 
Path 0 f2)(o", 0°) -| wot)! ifk+l>n. 


10. Let $y$ be a member of $K^\times$, and let $\psi$ be the unique element of $\operatorname{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, K^\times)$
with $\psi(1) = y$. Why in the context of Proposition 3.32 is $\psi$ a 2-cocycle if and
only if $y$ is in $F^\times$?

11. Take $\psi$ as in the previous problem with $\psi(1) = r^{-1}$, and show that the member of
$C^2(G, K^\times)$ that corresponds to it under Problem 9 is the factor set $a$ of Problem 6.


12. Deduce from the previous problem that the order of the Brauer equivalence class
in $\mathcal{B}(K/F)$ is the order of the coset of $r$ in $F^\times/N_{K/F}(K^\times)$. Why does it follow
that $A$ is a division algebra over $F$ if the coset of $r$ in $F^\times/N_{K/F}(K^\times)$ has exact
order $n$? (Educational note: This result is a theorem of Wedderburn except that
it is here dressed in more modern language. The special case that $n$ is prime
was already handled by Problems 18-19 at the end of Chapter II. Although the
converse was seen in those problems to be valid for $n$ prime, the converse is
known to fail for $n = 4$.)


Problems 13-20 introduce the reduced norm of a central simple algebra and give an
application. Let $A$ be a central simple algebra over a field $F$ with $\dim_F A = n^2$. For
$a$ in $A$, the algebra polynomial of $a$ is defined to be the characteristic polynomial
$\det(X1 - L(a))$ of the $F$ linear mapping $L(a) : A \to A$ given by the left multiplication
$x \mapsto ax$. This monic polynomial lies in $F[X]$ and has degree $n^2$. The ordinary
norm $N_{A/F}(a)$ is defined to be $(-1)^{n^2}$ times the constant term, and the ordinary
trace $\operatorname{Tr}_{A/F}(a)$ is defined to be minus the coefficient of $X^{n^2-1}$; these functions of $a$
take values in $F$. Choose a finite Galois extension $K$ of $F$ that splits $A$, and fix an
isomorphism $\varphi : A \otimes_F K \to M_n(K)$. The reduced polynomial of $a$ is defined to be
the monic polynomial $\det(\varphi(X1 - a \otimes 1))$. This polynomial lies in $K[X]$ and has
degree $n$. The reduced norm $\operatorname{Nrd}_{A/F}(a)$ is defined to be $(-1)^n$ times the constant
term, and the reduced trace $\operatorname{Trrd}_{A/F}(a)$ is defined to be minus the coefficient of
$X^{n-1}$; these functions of $a$ initially take values in $K$.


13. Prove that the reduced polynomial of $a$ does not depend on the choice of the
isomorphism $\varphi$.

14. Prove that $\det(X1 - L(a)) = (\det \varphi(X1 - a \otimes 1))^n$.

15. Using Galois theory and unique factorization, prove that any monic polynomial
$P(X)$ in $K[X]$ such that $P(X)^n$ lies in $F[X]$ already lies in $F[X]$. Conclude that
the reduced polynomial of any element of $A$ is in $F[X]$.

16. Prove that $\det(\varphi(X1 - a \otimes 1))$ does not depend on the choice of the Galois
extension $K$ of $F$ that splits $A$.


17. Deduce that $\operatorname{Nrd}_{A/F}$ is a function from $A$ to $F$ such that $\operatorname{Nrd}_{A/F}(ab) =
\operatorname{Nrd}_{A/F}(a)\operatorname{Nrd}_{A/F}(b)$ for all $a$ and $b$ in $A$, $\operatorname{Nrd}_{A/F}(1) = 1$, and $\operatorname{Nrd}_{A/F}(a)^n =
N_{A/F}(a)$ for all $a$ in $A$. How does it follow that
(a) an element $a \in A$ is invertible if and only if $\operatorname{Nrd}_{A/F}(a) \neq 0$ and
(b) $A$ is a division algebra if and only if $\operatorname{Nrd}_{A/F}(a) = 0$ only for $a = 0$?


18. Let $K/F$ be a finite Galois extension of fields, put $G = \operatorname{Gal}(K/F)$, and suppose
that a crossed-product algebra $A = A(K, G, a)$ is given as in Proposition 3.12
with $K \subseteq A$ and with $\dim_F A = (\dim_F K)^2 = n^2$. Let $\{x_\sigma \mid \sigma \in G\}$ be the
system in the proposition such that $A = \bigoplus_{\sigma \in G} Kx_\sigma$. Associate a matrix $m(v)$
in $M_n(K)$ to each $v \in A$ as follows. The rows and columns of the matrices are
indexed by $G$, and $E_{\sigma,\tau}$ denotes the matrix that is 1 in the $(\sigma, \tau)$ entry and is 0
elsewhere. Let $m(cx_\tau) = \sum_{\sigma} \sigma(c)a(\sigma, \tau)E_{\sigma,\sigma\tau}$ for $c \in K$, and extend additively
to handle all $v \in A$. Check that $v \mapsto m(v)$ is a one-one $F$ algebra homomorphism
of $A$ into $M_n(K)$, and prove that $\operatorname{Nrd}_{A/F}(v) = \det m(v)$. (Educational note: Thus
by Proposition 3.12 the matrix algebra in Problems 6-12 is central simple.)


19. Identify the norm and the reduced norm for the real algebra $\mathbb{H}$ of quaternions.


20. A field $F$ is said to satisfy condition (C1) if every homogeneous polynomial
of degree $d$ in $n$ variables with $d < n$ has a nontrivial zero. Using the reduced
norm for a central division algebra over $F$, prove that condition (C1) implies
that $\mathcal{B}(F) = 0$. (Educational note: Algebraically closed fields and finite fields
satisfy (C1), the latter by a theorem of Chevalley. A deeper fact is that a simple
transcendental extension of an algebraically closed field satisfies (C1).)


CHAPTER IV 


Homological Algebra 


Abstract. This chapter develops the rudiments of the subject of homological algebra, which is an 
abstraction of various ideas concerning manipulations with homology and cohomology. Sections 
1-7 work in the context of good categories of modules for a ring, and Section 8 extends the discussion 
to abelian categories. 

Section 1 gives a historical overview, defines the good categories and additive functors used in
most of the chapter, and gives a more detailed outline than appears in this abstract.

Section 2 introduces some notions that recur throughout the chapter—complexes, chain maps, 
homotopies, induced maps on homology and cohomology, exact sequences, and additive functors. 
Additive functors that are exact or left exact or right exact play a special role in the theory. 

Section 3 contains the first main theorem, saying that a short exact sequence of chain or cochain 
complexes leads to a long exact sequence in homology or cohomology. This theorem sees repeated 
use throughout the chapter. Its proof is based on the Snake Lemma, which associates a connecting 
homomorphism to a certain kind of diagram of modules and maps and which establishes the exactness 
of a certain 6-term sequence of modules and maps. The section concludes with proofs of the crucial 
fact that the Snake Lemma and the first main theorem are functorial. 

Section 4 introduces projectives and injectives and proves the second main theorem, which 
concerns extensions of partial chain and cochain maps and also construction of homotopies for 
them when the complexes in question satisfy appropriate hypotheses concerning exactness and the 
presence of projectives or injectives. The notion of a resolution is defined in this section, and the 
section concludes with a discussion of split exact sequences. 

Section 5 introduces derived functors, which are the basic mathematical tool that takes advantage
of the theory of homological algebra. Derived functors of all integer orders $\geq 0$ are defined for any
left exact or right exact additive functor when enough projectives or injectives are present, and they
generalize homology and cohomology functors in topology, group theory, and Lie algebra theory.

Section 6 implements the two theorems of Section 3 in the situation in which a left exact or right 
exact additive functor is applied to an exact sequence. The result is a long exact sequence of derived 
functor modules. It is proved that the passage from short exact sequences to long exact sequences 
of derived functor modules is functorial. 

Section 7 studies the derived functors of Hom and tensor product in each variable. These are 
called Ext and Tor, and the theorem is that one obtains the same result by using the derived functor 
mechanism in the first variable as by using the derived functor mechanism in the second variable. 

Section 8 discusses the generalization of the preceding sections to abelian categories, which are 
abstract categories satisfying some strong axioms about the structure of morphisms and the presence 
of kernels and cokernels. Some generalization is needed because the theory for good categories is 
insufficient for the theory for sheaves, which is an essential tool in the theory of several complex 
variables and in algebraic geometry. Two-thirds of the section concerns the foundations, which 
involve unfamiliar manipulations that need to be internalized. The remaining one-third introduces an 
artificial definition of “member” for each object and shows that familiar manipulations with members
can be used to verify equality of morphisms, commutativity of square diagrams, and exactness of 
sequences of objects and morphisms. The consequence is that general results for categories of 
modules in homological algebra requiring such verifications can readily be translated into results for 
general abelian categories. The method with members, however, does not provide for constructions 
of morphisms member by member. Thus the construction of the connecting homomorphism in the 
Snake Lemma needs a new proof, and that is given in a concluding example. 


1. Overview 


This chapter develops the rudiments of the subject of homological algebra. The 
only prerequisite within the present volume is the self-contained Section III.5 
entitled “Digression on Cohomology of Groups,” which is helpful primarily as 
motivation. The definitions of category, functor, object, morphism, natural trans- 
formation, product, and coproduct as in Chapters IV and VI of Basic Algebra will 
be taken as known, and it will be helpful as motivation to know also the material 
from Chapter VII of Basic Algebra on group extensions and cohomology of 
groups. The present chapter will make some allusions to notions from algebraic 
topology, particularly in this first section, and the reader is encouraged to skip 
lightly over anything of this kind that might be an impediment to continuing with 
the remainder of the chapter. 

Homology and cohomology have their origins in attempts to assign algebraic 
invariants to topological obstructions. One example historically was the holes 
in a domain of the Euclidean plane that can make line integrals that are locally 
independent of the path fail to be globally independent of the path. Another was 
the handles on 2-dimensional closed surfaces. These obstructions were originally 
viewed as numbers (Betti numbers for example) and later viewed as algebraic 
objects such as abelian groups or vector spaces. A big advance was to regard 
them not just as objects attached to geometric configurations but as functors that 
attach objects to geometric configurations and also attach functions between such 
objects to reflect the behavior of functions between geometric configurations. 

Hints of connections with algebra on a deeper level and hints that homology and 
cohomology could be computed quite flexibly began with work of W. Hurewicz 
in 1936 and H. Hopf in 1942. Hurewicz considered the following situation: M 
is a finite connected simplicial complex, U is its universal cover, and G is the 
fundamental group of M. Suppose that U is contractible. The group G acts freely 
on the group $C_*(U)$ of simplicial chains of U (with integer coefficients). The 
boundary operator then gives us an exact sequence 

$$0 \longleftarrow \mathbb{Z} \longleftarrow C_0(U) \longleftarrow C_1(U) \longleftarrow C_2(U) \longleftarrow \cdots$$

of abelian groups with an action of G on each $C_k(U)$ by automorphisms in such 
a way that each $C_k(U)$ in effect is a free $\mathbb{Z}G$ module. Applying $(\cdot) \otimes_{\mathbb{Z}G} \mathbb{Z}$, we 
obtain the complex 

$$0 \longleftarrow C_0(M) \longleftarrow C_1(M) \longleftarrow C_2(M) \longleftarrow \cdots.$$

The homology $H_0(M)$ is just $\mathbb{Z}$ because M is connected, and $H_1(M)$ is just the 
quotient of G by its commutator subgroup; thus $H_0(M)$ and $H_1(M)$ depend only 
on G. What Hurewicz showed is that all higher $H_k(M)$ depend only on G; he did 
not address existence of such spaces M and U for G. 

Hopf clarified the situation and drew attention to it by making an explicit 
calculation: Dropping all assumptions on U other than its simple connectivity, 
he gave a formula for the quotient of $H_2(M)$ modulo the subgroup of “spherical 
homology classes” in terms of G. Later he obtained a result for higher-degree 
homology. In effect, Hopf was giving formulas for $H_n(G, \mathbb{Z})$ by discovering and 
applying the homology analog of the cohomology result given as Theorem 3.31 
in Section III.5. 

Meanwhile, S. Eilenberg in 1944 made an adjustment to Lefschetz’s singular 
homology theory and showed for locally finite polyhedra that his adjusted theory 
gives the same groups as the more traditional simplicial theory. His method 
was to introduce a third complex, to exhibit chain maps from this to each of 
the complexes under study, and show that the chain maps possess inverses in a 
suitable sense. 

In addition to the people mentioned above, some others who pursued these mat- 
ters in the mid 1940s were R. Baer, B. Eckmann, H. Freudenthal, and S. Mac Lane. 
One thing that mathematicians gradually realized was that homology and coho- 
mology in various situations can be calculated from suitable kinds of abstract 
resolutions, a fact that lies at the heart of the subject of homological algebra. 
Another was that the subject of cohomology of groups made sense on an abstract 
level without any reference to topology and that the theory of factor sets for group 
extensions, as had been introduced by O. Schreier in the 1920s, was actually one 
aspect of this theory. 

With a great leap of generality, H. Cartan and Eilenberg set down such a theory 
in their celebrated book Homological Algebra, whose publication was delayed 
until 1956. Homology and cohomology became things attached to complexes, 
no longer dependent on topology, and the book developed enormous machinery 
for working with such complexes and homology/cohomology. By the time that 
Cartan and Eilenberg had published their book, other special cases of homological 
algebra had already arisen. One was the cohomology theory of Lie algebras, 
developed by C. Chevalley in the 1940s and by J.-L. Koszul in 1950. Another was 
the cohomology theory of sheaves, used in the subject of several complex variables 
starting about 1950 by K. Oka and H. Cartan; sheaves themselves had been 
introduced in 1946 by J. Leray in connection with partial differential equations. 

In the eventual theory the fundamental notion is that of a “derived functor”: 
homology or cohomology is obtained by starting from some kind of resolution, 
or exact complex, passing to another complex by means of a functor with some 
special properties, and then extracting the homology or cohomology of the image 
complex. Two categories are thus involved, one for the resolution and one for 
the values of the functor. From an expository point of view, it seems wise to start 
with concrete categories and not to try to identify the most general categories for 
which the theory makes sense. For much of the chapter, we shall work with a 
category not much more general than the category $\mathcal{C}_R$ of all unital left R modules, 
where R is a ring with identity, and our functors will pass from one such category 
to another. Use of categories $\mathcal{C}_R$ subsumes the following applications: 


(i) manipulations with basic homology and cohomology in topology, in 
which one begins with the ring R = Z of integers. For more advanced 
applications in topology, one moves from Z to more general rings. 

(ii) homology and cohomology of groups, in which one initially uses group 
rings of the form ZG, where G is any group and Z is the ring of integers. 

(iii) homology and cohomology of Lie algebras. If $\mathfrak{g}$ is a Lie algebra over 
a field such as $\mathbb{C}$, then $\mathfrak{g}$ has a “universal enveloping algebra” $U(\mathfrak{g})$ 
and a canonical mapping $\iota : \mathfrak{g} \to U(\mathfrak{g})$. Here $U(\mathfrak{g})$ is a complex 
associative algebra with identity, $\iota$ is a Lie algebra homomorphism, and 
the pair $(U(\mathfrak{g}), \iota)$ has the following universal mapping property: whenever 
$\varphi : \mathfrak{g} \to A$ is a Lie algebra homomorphism into a complex associative 
algebra A with identity, then there is a unique homomorphism 
$\Phi : U(\mathfrak{g}) \to A$ of associative algebras with identity such that $\varphi = \Phi \circ \iota$. 
Lie algebra homology and cohomology are the theory for the set-up in 
which the initial underlying rings are $U(\mathfrak{g})$ and $\mathbb{C}$. 


In other words, in each of the three applications above, many derived functors of 
importance pass from the category $\mathcal{C}_R$ for a ring R with identity to the category 
$\mathcal{C}_S$ for another ring S with identity. 

The slight generalization of categories $\mathcal{C}_R$ that we shall use for much of the 
chapter is as follows: Let R be a ring with identity. A good category C of R 
modules consists of 


(i) some nonempty class of unital left R modules closed under passage 
to submodules, quotients, and finite direct sums (the modules of the 
category), 

(ii) the full sets $\operatorname{Hom}_R(A, B)$ of all R linear homomorphisms from A to B 
for each A and B as in (i) (the morphisms, or maps, of the category). 


For example the collection of all finitely generated abelian groups, as a subcategory 
of $\mathcal{C}_{\mathbb{Z}}$, is a good category.¹ So is the collection of all torsion abelian groups, 
i.e., abelian groups whose elements all have finite order, as a subcategory of $\mathcal{C}_{\mathbb{Z}}$. 

¹One reason for working with this slight generalization is to emphasize that a certain property 
of categories $\mathcal{C}_R$, namely that they have “enough projectives” and “enough injectives” in a sense to 
be made precise below in Section 5, does not necessarily persist for slight variants of $\mathcal{C}_R$. 

The definition of “good category” specifies left R modules that are unital. 
However, the theory applies equally well to right R modules that are unital, since 
a unital right R module becomes a unital left module for the opposite ring $R^o$, 
i.e., the ring whose underlying abelian group is the same as for R and whose 
multiplication is given by $a \circ b = ba$. 

The special property of a functor $F : \mathcal{C} \to \mathcal{C}'$ used for passing from a complex 
in one good category to a complex in another good category is that it is additive, 
namely that $F(\varphi_1 + \varphi_2) = F(\varphi_1) + F(\varphi_2)$ whenever $\varphi_1$ and $\varphi_2$ are in the 
same $\operatorname{Hom}_R(A, B)$. The initial examples of additive functors are tensor product 
$M \otimes_R (\cdot)$, which passes from $\mathcal{C}_R$ to $\mathcal{C}_{\mathbb{Z}}$ if M is a right R module, and Hom in 
each variable: $\operatorname{Hom}_R(\cdot, M)$ and $\operatorname{Hom}_R(M, \cdot)$, both of which pass from $\mathcal{C}_R$ to 
$\mathcal{C}_{\mathbb{Z}}$ if M is a left R module. In Section 2 we shall consider additive functors in 
more detail. 

The set-up with good categories does not subsume the cohomology of sheaves, 
nor some other applications of interest, such as the cohomology of vector bundles 
with a fixed base. The cohomology of sheaves is an important tool in algebraic 
geometry and several complex variables, and it cannot be ignored. Consequently 
one ultimately wants the theory to extend to other categories than good categories 
of modules. In addition, it is quite useful to have the theory work for the categories 
opposite to two given categories if it works for two given categories, and this 
feature means that the general theory should not insist that the objects be sets 
of elements and the morphisms be functions on such elements. Accordingly the 
abstract theory is carried out for “abelian categories,” which will be defined in 
Section 8. The idea for creating the abstract theory is to take the theory for good 
categories of modules and rephrase all of the results for all abelian categories. In 
many instances the proofs will translate easily to the general setting, but in other 
instances it will be necessary to eliminate individual elements from arguments 
and obtain new arguments that rely only on complexes, exact sequences, and 
commutative diagrams. Some of this detail will be carried out in Section 8. 

Sections 2-3 establish the framework of homology and cohomology in the 
context of good categories of modules. Section 2 discusses complexes and exact 
sequences at length, and Section 3 shows how a short exact sequence of complexes 
leads to a long exact sequence in homology or cohomology. This is the first main 
result of the theory and finds multiple uses later in the chapter. 

Section 4 contains a discussion of “projectives and injectives” that expands and 
systematizes Theorem 3.31, which concerned the flexible role of resolutions in 
computing the cohomology of groups. Once that flexibility is in place in the more 
general setting of good categories, Sections 5-6 introduce derived functors and 
some of their properties. The main examples of derived functors at this stage are 
functors $\operatorname{Ext}(\cdot, \cdot)$ and $\operatorname{Tor}(\cdot, \cdot)$ obtained from Hom and tensor product; these 
are examined more closely in Section 7. The example given in Section III.5 and 
now being used as motivation requires some subtlety to be regarded as a derived 
functor. That example was the system of functors $H^n(G, \cdot)$ yielding cohomology 
of the group G with coefficients in the module $(\cdot)$; these were obtained in Section 
III.5 by applying the functor $\operatorname{Hom}_{\mathbb{Z}G}(\cdot, M)$ to any free resolution of $\mathbb{Z}$ in the 
category $\mathcal{C}_{\mathbb{Z}G}$. It is seen in examples in Section 5 that the effect of using the free 
resolution was to compute $H^n(G, M)$ as $\operatorname{Ext}^n_{\mathbb{Z}G}(\cdot, M)$ when the variable is set 
equal to $\mathbb{Z}$; realizing this result as a derived functor in the M variable requires 
knowing that one gets the same result from $\operatorname{Ext}^n_{\mathbb{Z}G}(\mathbb{Z}, \cdot)$ when its variable is set 
equal to M. This conclusion is part of Theorem 4.31, which is proved in Section 7. 

The first seven sections complete the treatment of the rudiments of homological 
algebra in the setting of good categories. One more central technique beyond that 
of derived functors is the mechanism of spectral sequences, but we shall omit this 
topic to save space.² 

The chapter concludes with some discussion of abelian categories in Section 8. 
The foundations of homological algebra have to be redone completely when 
objects are no longer necessarily sets of elements. After this step, one introduces 
a substitute notion of “member” for elements, establishes its properties, and 
immediately obtains extensions of much of the theory to all abelian categories. A 
supplementary argument is needed whenever the theory for good categories uses 
an element-by-element construction of a homomorphism. 

Sheaves are introduced in the last section of text in Chapter X, and their 
cohomology is mentioned very briefly there. 


2. Complexes and Additive Functors 


Let C be a good category of R modules in the sense of Section 1. A complex in C 
is a finite or infinite sequence of modules and maps in C such that the consecutive 
compositions are all 0. There is no harm in assuming that the indexing for 
the sequence is done by all of Z, since we can always adjoin 0 modules and 0 
maps as necessary to fill out the indexing. The indices may be increasing or 
decreasing, and, as we saw in Section III.5, this distinction is only a formality. 
However, the distinction is very convenient when it comes to applications, since 
homology is normally associated with decreasing indices and cohomology is 
normally associated with increasing indices. 

Thus let us be more precise about the indexing. A chain complex in $\mathcal{C}$ is 
a sequence of pairs $X = \{(X_n, \partial_n)\}_{n=-\infty}^{\infty}$ in which each $X_n$ is a module in $\mathcal{C}$, 
each $\partial_n$ is a map in $\operatorname{Hom}_R(X_{n+1}, X_n)$, and $\partial_n \partial_{n+1} = 0$ for all n. The maps 
$\partial_n$ are sometimes called boundary maps, or boundary operators. We define 
the homology of X, written $H_*(X) = \{H_n(X)\}_{n=-\infty}^{\infty}$ with subscripts, to be the 
sequence of modules in $\mathcal{C}$ given by 

$$H_n(X) = (\ker \partial_{n-1}) / (\operatorname{image} \partial_n).$$

The members of the space $\ker \partial_{n-1}$ are called n-cycles, and the members of the 
space $\operatorname{image} \partial_n$ are called n-boundaries. 

²For the reader who is interested in learning about spectral sequences, this author is partial to the 
explanation of the topic in Appendix D of the book by Knapp and Vogan in the Selected References. 
The setting in that appendix is limited to good categories of modules, and some important applications 
are included. 


EXAMPLES OF CHAIN COMPLEXES. 


(1) Simplicial homology. Let S be a simplicial complex of dimension N, and 
number its vertices. For each integer n, the group $C_n(S)$ of simplicial n-chains 
is the free abelian group on the set of simplices of dimension n. This is 0 for 
$n < 0$ and $n > N$. In elementary topology one defines the boundary of each 
n-simplex to be the member of $C_{n-1}(S)$ equal to an integer combination of its 
faces, the coefficient of the face being $(-1)^i$ if the missing vertex for the face is 
the $i^{\text{th}}$ of the n + 1 vertices of the given n-simplex. This definition is extended 
additively to the boundary map $\partial_{n-1} : C_n(S) \to C_{n-1}(S)$, and a combinatorial 
argument gives $\partial_{n-1}\partial_n = 0$ for all n. Thus $X = \{(C_n(S), \partial_{n-1})\}$ is a complex. 
The associated homology $H_n(X)$ is the $n^{\text{th}}$ (integral simplicial) homology of the 
simplicial complex S and is usually denoted by $H_n(S)$. 


(2) Cubical singular homology. Let S be a topological space. For $n \geq 0$, a 
singular n-cube in S is a continuous function $T : I^n \to S$, where $I^n$ denotes the 
n-fold product of the closed interval [0, 1] with itself. The free abelian group on 
the set of n-cubes is denoted by $Q_n(S)$. A singular n-cube T is degenerate if 
its values are independent of one of the n variables. The subgroup of $Q_n(S)$ 
generated by the degenerate singular n-cubes is denoted by $D_n(S)$, and the 
quotient $C_n(S) = Q_n(S)/D_n(S)$ is the group of cubical singular n-chains. 
One defines a boundary operator from $Q_n(S)$ to $Q_{n-1}(S)$ for each n in analogy 
with the definition in the previous example and shows that it carries $D_n(S)$ into 
$D_{n-1}(S)$. Consequently the boundary operator descends to a homomorphism of 
abelian groups $\partial_{n-1} : C_n(S) \to C_{n-1}(S)$. A combinatorial argument shows that 
$\partial_{n-1}\partial_n = 0$; thus we get a complex. The associated homology is the $n^{\text{th}}$ (integral 
singular) homology of S and is usually denoted by $H_n(S)$. 


(3) Free resolution of $\mathbb{Z}$ in $\mathcal{C}_{\mathbb{Z}G}$. Let G be a group. Then the standard resolution 
of $\mathbb{Z}$ in the category $\mathcal{C}_{\mathbb{Z}G}$, as given in Theorem 3.20, is a chain complex in that 
category. 
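
The simplicial example can be made concrete with a short computation. The following Python sketch is an illustration only and is not part of the text's development: it takes the boundary of a tetrahedron (a triangulated 2-sphere) as a hypothetical test case, builds the boundary matrices with the sign rule $(-1)^i$ described in Example 1, checks that consecutive boundary maps compose to 0, and computes the ranks of the homology groups. The function names `boundary_matrix`, `rank`, and `matmul` are introduced here for illustration. Ranks are computed over the rationals, so the sketch detects only the free part of each homology group and not torsion.

```python
from fractions import Fraction
from itertools import combinations


def boundary_matrix(faces, simplices):
    """Matrix of the boundary map from chains on `simplices` to chains on `faces`.

    Each simplex is a sorted tuple of vertex numbers; the face obtained by
    omitting the i-th vertex receives the coefficient (-1)**i.
    """
    index = {face: row for row, face in enumerate(faces)}
    mat = [[0] * len(simplices) for _ in faces]
    for col, simplex in enumerate(simplices):
        for i in range(len(simplex)):
            face = simplex[:i] + simplex[i + 1:]
            mat[index[face]][col] = (-1) ** i
    return mat


def rank(mat):
    """Rank of an integer matrix, by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in mat]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# Hypothetical test case: the boundary of a tetrahedron on vertices 0, 1, 2, 3.
vertices = [(v,) for v in range(4)]
edges = list(combinations(range(4), 2))
triangles = list(combinations(range(4), 3))

d0 = boundary_matrix(vertices, edges)    # C_1 -> C_0
d1 = boundary_matrix(edges, triangles)   # C_2 -> C_1

# Consecutive boundary maps compose to 0, so we do have a complex.
assert all(x == 0 for row in matmul(d0, d1) for x in row)

r0, r1 = rank(d0), rank(d1)
# rank H_n = rank(kernel of the outgoing map) - rank(incoming map)
betti = [len(vertices) - r0, len(edges) - r0 - r1, len(triangles) - r1]
print(betti)  # [1, 0, 1]
```

The output reflects that the complex is connected, has no independent 1-dimensional cycles modulo boundaries, and carries one independent 2-dimensional cycle.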


Let us make the class of chain complexes for the good category $\mathcal{C}$ into a category. 
Each chain complex is to be an object. If $X = \{(X_n, \partial_n)\}$ and $X' = \{(X'_n, \partial'_n)\}$ 
are two chain complexes in $\mathcal{C}$, a morphism in $\operatorname{Morph}(X, X')$ is any chain map 
$f = \{f_n\}$, defined as a sequence of maps $f_n \in \operatorname{Hom}_R(X_n, X'_n)$ such that the 
diagram 

$$\begin{array}{ccc}
X_n & \xrightarrow{\ \partial_{n-1}\ } & X_{n-1} \\
{\scriptstyle f_n}\big\downarrow & & \big\downarrow{\scriptstyle f_{n-1}} \\
X'_n & \xrightarrow{\ \partial'_{n-1}\ } & X'_{n-1}
\end{array}$$

commutes for all n. Briefly $f\partial = \partial' f$. Since the $f_n$'s are functions, it is 
customary to use function notation $f : X \to X'$ for chain maps. The system $\{1_{X_n}\}$ 
of identity maps serves as an identity morphism, and coordinate-by-coordinate 
composition is associative. Thus the result is a category. 

The next step is to observe that homology $H_*$, as applied to chain maps for 
the category $\mathcal{C}$, is a covariant functor from the category of chain maps to itself. 
The effect of the functor on objects is to send X to $H_*(X) = \{(H_n(X), 0)\}$. If 
$f : X \to X'$ is a chain map, then the formula $\partial'_{n-1}(f_n(x_n)) = f_{n-1}(\partial_{n-1}(x_n))$ 
shows that $f_n(\ker \partial_{n-1}) \subseteq \ker \partial'_{n-1}$, and the formula $\partial'_n(f_{n+1}(x_{n+1})) = 
f_n(\partial_n(x_{n+1}))$ shows that $f_n(\operatorname{image} \partial_n) \subseteq \operatorname{image} \partial'_n$. Therefore $f_n$ descends 
to the quotient, giving a map $H_n(f) : H_n(X) \to H_n(X')$. The assembled 
collection of maps $H_*(f) : H_*(X) \to H_*(X')$ is manifestly a chain map. Instead 
of writing $H_n(f)$ for the map induced by $f_n$ on the $n^{\text{th}}$ homology, we shall often 
write $(f_n)_*$ or $\bar{f}_n$, especially in diagrams, to make the notation less cumbersome. 
Since the identity chain map yields the identity on $H_n(X)$ and since compositions 
go to compositions in the same order, homology $H_*$ is a covariant functor. 

If $f : X \to X'$ and $g : X \to X'$ are two chain maps, then a homotopy h 
of f to g is a system of maps $h = \{h_n\}$ increasing degrees by 1, i.e., having 
$h_n$ carry $X_n$ into $X'_{n+1}$, such that $h_{n-1}\partial_{n-1} + \partial'_n h_n = f_n - g_n$ for all n. Briefly 
$h\partial + \partial' h = f - g$. When such an h exists, we say that f and g are homotopic, 
and we write $f \simeq g$. This relation is an equivalence relation. 
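
The assertion that homotopy is an equivalence relation can be checked directly from the defining identity $h\partial + \partial' h = f - g$. The lines below are a brief sketch of that verification; the symbols $h'$ (a second homotopy) and $k$ (a third chain map from X to X') are introduced here only for illustration.

```latex
% Reflexivity: the zero system h = 0 is a homotopy of f to f.
0\,\partial + \partial'\,0 = 0 = f - f .

% Symmetry: if h is a homotopy of f to g, then -h is a homotopy of g to f.
(-h)\partial + \partial'(-h) = -(h\partial + \partial' h) = -(f - g) = g - f .

% Transitivity: if h is a homotopy of f to g and h' is a homotopy of g to k,
% then h + h' is a homotopy of f to k.
(h + h')\partial + \partial'(h + h')
  = (h\partial + \partial' h) + (h'\partial + \partial' h')
  = (f - g) + (g - k) = f - k .
```

Each step uses only that the sets $\operatorname{Hom}_R(A, B)$ are abelian groups and that composition is additive in each variable.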


Proposition 4.1. If $f : X \to X'$ and $g : X \to X'$ are homotopic chain maps 
in the good category $\mathcal{C}$, then f and g induce the same maps $H_*(f)$ and $H_*(g)$ 
on homology, i.e., $H_n(f)$ and $H_n(g)$ are the same map of $H_n(X)$ into $H_n(X')$ for 
each n. 


PROOF. Let h be a homotopy, and suppose that $\partial_{n-1}(x_n) = 0$. Then the 
computation $f_n(x_n) - g_n(x_n) = h_{n-1}\partial_{n-1}(x_n) + \partial'_n h_n(x_n) = 0 + \partial'_n h_n(x_n)$ shows 
that the images of $x_n$ under $f_n$ and $g_n$ in $X'_n$ differ by a member of $\operatorname{image} \partial'_n$. 


Briefly let us translate all of these definitions and conclusions into statements 
when the complexes have increasing indices. A cochain complex in $\mathcal{C}$ is a 
sequence of pairs $X = \{(X_n, d_n)\}_{n=-\infty}^{\infty}$ in which each $X_n$ is a module in $\mathcal{C}$, 
each $d_n$ is a map in $\operatorname{Hom}_R(X_n, X_{n+1})$, and $d_{n+1} d_n = 0$ for all n. The maps $d_n$ 
are sometimes called coboundary maps, or coboundary operators. We define 
the cohomology of X, written $H^*(X) = \{H^n(X)\}_{n=-\infty}^{\infty}$ with superscripts, to 
be the sequence of modules in $\mathcal{C}$ given by $H^n(X) = (\ker d_n)/(\operatorname{image} d_{n-1})$. The 
members of the space $\ker d_n$ are called n-cocycles, and the members of $\operatorname{image} d_{n-1}$ 
are called n-coboundaries. 


EXAMPLES OF COCHAIN COMPLEXES. 


(1) Singular cohomology. Let S be a topological space, let $X = \{(C_n(S), \partial_{n-1})\}$ 
be its complex of cubical singular n-chains, and let M be any abelian group. If 
$C^n(S, M) = \operatorname{Hom}_{\mathbb{Z}}(C_n(S), M)$ and if $d_n : C^n(S, M) \to C^{n+1}(S, M)$ is the 
map $d_n = \operatorname{Hom}(\partial_n, 1)$, then $Y = \{(C^n(S, M), d_n)\}$ is a cochain complex, 
and its cohomology, written $H^*(Y) = \{H^n(S, M)\}$, is the (integral singular) 
cohomology of S with coefficients in M. 


(2) Cohomology of groups. Let G be a group, and let M be an abelian group 
on which G acts by automorphisms. Let $C^n(G, M)$ be the abelian group of 
functions from the n-fold product of G with itself into M, the functions being 
added pointwise. Define $\delta_n : C^n(G, M) \to C^{n+1}(G, M)$ as in Section III.5. 
Then $X = \{(C^n(G, M), \delta_n)\}$ is a cochain complex, and its cohomology $H^*(X) = 
\{H^n(G, M)\}$ is the cohomology of G with coefficients in M. 


The cochain complexes for the good category $\mathcal{C}$ form a category for which the 
morphisms from $X = \{(X_n, d_n)\}$ to $X' = \{(X'_n, d'_n)\}$ are cochain maps $f = \{f_n\}$; 
the latter are defined by the conditions that $f_n$ carry $X_n$ to $X'_n$ and $fd = d'f$, i.e., 
$f_{n+1} d_n = d'_n f_n$ for all n. Cohomology $H^*$, as applied to cochain maps for the 
category $\mathcal{C}$, is a covariant functor from the category of cochain maps to itself. 
The effect of the functor on objects is to send X to $H^*(X) = \{(H^n(X), 0)\}$, and 
the argument that a cochain map $f : X \to X'$ carries $H^*(X)$ to $H^*(X')$ via a 
cochain map $H^*(f)$ is the same as for chain maps. Instead of writing $H^n(f)$ for 
the map induced by $f_n$ on the $n^{\text{th}}$ cohomology, we shall often write $(f_n)^*$ or $\bar{f}_n$,³ 
especially in diagrams, to make the notation less cumbersome. 

If $f : X \to X'$ and $g : X \to X'$ are two cochain maps, then a homotopy 
h of f to g is a system of maps $h = \{h_n\}$ decreasing degrees by 1, i.e., having 
$h_n$ carry $X_n$ into $X'_{n-1}$, such that $h_{n+1} d_n + d'_{n-1} h_n = f_n - g_n$ for all n. Briefly 
$hd + d'h = f - g$. When such an h exists, we say that f and g are homotopic, 
and we write $f \simeq g$. This relation is an equivalence relation. 


³The notation with the bar is to be avoided when there might be some ambiguity about which of 
homology and cohomology is involved. 

Proposition 4.1′. If $f : X \to X'$ and $g : X \to X'$ are homotopic cochain 
maps in the good category $\mathcal{C}$, then f and g induce the same maps $H^*(f)$ and 
$H^*(g)$ on cohomology, i.e., $H^n(f)$ and $H^n(g)$ are the same map of $H^n(X)$ into 
$H^n(X')$ for each n. 


PROOF. Let h be a homotopy, and suppose that $d_n(x_n) = 0$. Then the computation 
$f_n(x_n) - g_n(x_n) = h_{n+1} d_n(x_n) + d'_{n-1} h_n(x_n) = 0 + d'_{n-1} h_n(x_n)$ shows 
that the images of $x_n$ under $f_n$ and $g_n$ in $X'_n$ differ by a member of $\operatorname{image} d'_{n-1}$. 


A chain or cochain complex written neutrally as $X = \{X(n)\}$ is exact at $X(n)$ 
if the kernel of the outgoing map at $X(n)$ equals the image of the incoming map 
at $X(n)$ (as opposed to merely containing the image). The complex is exact, or 
is an exact sequence, if it is exact at every $X(n)$. A short exact sequence is an 
exact sequence of the form 

$$0 \longrightarrow A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \longrightarrow 0,$$

understood to have 0’s at all positions beyond each end. The conditions on the 
5-term complex above for it to be exact are that $\varphi$ be one-one, $\psi$ be onto C, and 
that $\psi$ exhibit C as isomorphic to $B/\operatorname{image} \varphi$. To make the terminology more 
symmetric, it is customary to introduce a name for the quotient of the range of a 
homomorphism $\eta$ by the image of $\eta$; this quotient is defined to be the cokernel 
of the homomorphism and is denoted by $\operatorname{coker} \eta$. The conditions for exactness 
above can then be restated more symmetrically as 

$$\ker \varphi = \operatorname{coker} \psi = 0 \quad\text{and}\quad \operatorname{image} \varphi = \ker \psi.$$


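An exact sequence can always be broken into short exact sequences by stretching 
each link 

$$\cdots \longrightarrow A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } \cdots$$

into 

$$\cdots \longrightarrow A \xrightarrow{\ \varphi\ } \operatorname{image} \varphi \longrightarrow 0 \qquad 0 \longrightarrow \ker \psi \xrightarrow{\ \text{inc}\ } B \xrightarrow{\ \psi\ } \cdots$$

and breaking it between the 0’s; here “inc” denotes the inclusion mapping of 
$\ker \psi$ into B. This stretching process does not take us outside our good category, 
since good categories are assumed to be closed under passage to submodules and 
quotients. Conversely if we have two exact sequences 

$$\cdots \longrightarrow A \xrightarrow{\ \varphi\ } C \longrightarrow 0 \qquad\text{and}\qquad 0 \longrightarrow C \xrightarrow{\ i\ } B \xrightarrow{\ \psi\ } \cdots,$$

then we can combine them into an exact sequence 

$$\cdots \longrightarrow A \xrightarrow{\ i\varphi\ } B \xrightarrow{\ \psi\ } \cdots.$$

Exactness at A of the merged sequence follows because $\ker(i\varphi) = \ker \varphi$, and 
exactness at B follows because $\ker \psi = \operatorname{image} i = \operatorname{image}(i\varphi)$. 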

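Any map $\varphi : A \to B$ in our good category can be expressed in terms of an 
exact sequence by including the kernel and cokernel: 

$$0 \longrightarrow \ker \varphi \xrightarrow{\ i\ } A \xrightarrow{\ \varphi\ } B \xrightarrow{\ q\ } \operatorname{coker} \varphi \longrightarrow 0;$$

here $i : \ker \varphi \to A$ is the inclusion, and $q : B \to \operatorname{coker} \varphi$ is the quotient mapping. 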
All the modules and maps in the exact sequence are in the category, since good 
categories are assumed to be closed under passage to submodules and quotients. 
We shall use the following special case of this observation in Section 3. 

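Proposition 4.2. Let $X = \{(X_n, \partial_n)\}_{n=-\infty}^{\infty}$ be a chain complex in a good 
category with $\partial_n$ in $\operatorname{Hom}_R(X_{n+1}, X_n)$ for each n. Then the boundary operator 
$\partial_{n-1}$ on $X_n$ descends to the quotient as a mapping $\bar\partial_{n-1} : \operatorname{coker} \partial_n \to \ker \partial_{n-2}$ 
and yields an exact sequence 

$$0 \longrightarrow H_n(X) \xrightarrow{\ i\ } \operatorname{coker} \partial_n \xrightarrow{\ \bar\partial_{n-1}\ } \ker \partial_{n-2} \xrightarrow{\ q\ } H_{n-1}(X) \longrightarrow 0.$$

Here i is the inclusion $i : \ker \partial_{n-1}/\operatorname{image} \partial_n \to X_n/\operatorname{image} \partial_n$, and q is the quotient 
$q : \ker \partial_{n-2} \to \ker \partial_{n-2}/\operatorname{image} \partial_{n-1}$. This association of a six-term exact 
sequence to X for each n is functorial in the sense that if $X' = \{(X'_n, \partial'_n)\}_{n=-\infty}^{\infty}$ is 
a second chain complex and if $f : X \to X'$ is a chain map, then the diagram 

$$\begin{array}{ccccccc}
H_n(X) & \xrightarrow{\ i\ } & \operatorname{coker} \partial_n & \xrightarrow{\ \bar\partial_{n-1}\ } & \ker \partial_{n-2} & \xrightarrow{\ q\ } & H_{n-1}(X) \\
\big\downarrow & & \big\downarrow & & \big\downarrow & & \big\downarrow \\
H_n(X') & \xrightarrow{\ i'\ } & \operatorname{coker} \partial'_n & \xrightarrow{\ \bar\partial'_{n-1}\ } & \ker \partial'_{n-2} & \xrightarrow{\ q'\ } & H_{n-1}(X')
\end{array}$$

commutes; here the vertical maps are those induced by $f_n$ and $f_{n-1}$. 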


REMARKS. 

(1) The term “functorial” in the statement has a precise meaning in this and 
other contexts. Each chain complex is being carried to a 6-term exact sequence 
for each n. The chain complexes and the 6-term exact sequences both form 
categories, the morphisms in each case being chain maps. To say that the passage 
from the objects of one category to the other is functorial is to say that the 
passage between the categories is actually a functor, i.e., chain maps for the chain 
complexes are sent to chain maps for the 6-term exact sequences, the identity goes 
to the identity, and compositions go to compositions. The latter two conditions 
are evident, and what needs proof is that chain maps are carried to chain maps.⁴ 


(2) For a cochain complex $X = \{(X_n, d_n)\}_{n=-\infty}^{\infty}$ with $d_n$ in $\operatorname{Hom}_R(X_n, X_{n+1})$, 
the corresponding exact sequence is 

$$0 \longrightarrow H^n(X) \longrightarrow \operatorname{coker} d_{n-1} \xrightarrow{\ \bar{d}_n\ } \ker d_{n+1} \longrightarrow H^{n+1}(X) \longrightarrow 0,$$

and it is functorial with respect to cochain maps. 


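PROOF. To see that the map $\bar\partial_{n-1}$ carries $\operatorname{coker} \partial_n$ to $\ker \partial_{n-2}$, we write it as a 
composition 

$$\operatorname{coker} \partial_n = X_n/\operatorname{image} \partial_n \longrightarrow X_n/\ker \partial_{n-1} \cong \operatorname{image} \partial_{n-1} \subseteq \ker \partial_{n-2},$$

with the arrow induced by the inclusion $\operatorname{image} \partial_n \subseteq \ker \partial_{n-1}$ and with the 
isomorphism induced by applying $\partial_{n-1}$ to $X_n$ and passing to the quotient. Then we 
have $\ker \bar\partial_{n-1} = \ker \partial_{n-1}/\operatorname{image} \partial_n = H_n(X)$ and 

$$\operatorname{coker} \bar\partial_{n-1} = \ker \partial_{n-2}/\bar\partial_{n-1}(X_n/\operatorname{image} \partial_n) = \ker \partial_{n-2}/\partial_{n-1}X_n 
= \ker \partial_{n-2}/\operatorname{image} \partial_{n-1} = H_{n-1}(X),$$

and the exactness of the sequence is a special case of the exactness noted in the 
paragraph before the proposition. 

For the assertion that the association is functorial, the left square commutes 
because the verticals are both induced by the same map $f_n$, and the right square 
commutes because the verticals are both induced by the same map $f_{n-1}$. For the 
middle square the commutativity follows from the fact that $f_{n-1}\partial_{n-1} = \partial'_{n-1} f_n$. 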
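
A small worked instance may help fix the indexing in Proposition 4.2. The complex below is chosen here purely for illustration and does not appear in the text: it has $X_1 = X_0 = \mathbb{Z}$, all other $X_n = 0$, and $\partial_0$ equal to multiplication by 2.

```latex
% Illustrative complex:  0 <-- X_0 = Z  <--(2)--  X_1 = Z  <-- 0,
% with  \partial_0 = 2 : X_1 \to X_0  and all other boundary maps 0.
%
% Take n = 1 in Proposition 4.2:
H_1(X) = \ker\partial_0 / \operatorname{image}\partial_1 = 0, \qquad
\operatorname{coker}\partial_1 = X_1 / 0 = \mathbb{Z}, \qquad
\ker\partial_{-1} = X_0 = \mathbb{Z}, \qquad
H_0(X) = \ker\partial_{-1} / \operatorname{image}\partial_0 = \mathbb{Z}/2\mathbb{Z}.
%
% The six-term sequence therefore reads
0 \longrightarrow 0 \longrightarrow \mathbb{Z}
  \xrightarrow{\ \bar\partial_0 \,=\, 2\ } \mathbb{Z}
  \longrightarrow \mathbb{Z}/2\mathbb{Z} \longrightarrow 0,
% which is exact: multiplication by 2 is one-one with cokernel Z/2Z.
```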


⁴Some authors use the word “natural” instead of the word “functorial” in this situation. Authors 
who do this may have the notion of “natural transformation” between two functors in mind, or they 
may not. For those who do not, it seems advisable to use a different term like “functorial” to avoid 
confusion. For those who do, the allusion to a natural transformation is at best tortured in this 
instance. A natural transformation refers to two categories C and C’, and the most intuitive choice 
for C here is the category of chain complexes X. There are to be two functors from C to C’ and the 
natural transformation relates the values of those functors on X, for each X; no second complex X’ 
enters into matters. To have X’ involved in a natural transformation would mean including at least 
two chain complexes in each object of C. In other instances, however, some additional structure 
may be present. Then the distinction between “functorial” and “natural” may be one of emphasis 
concerning the data. The statements of Propositions 4.29 and 4.30 below provide examples. 
As was mentioned in Section 1, our interest will be in functors $F : \mathcal{C} \to \mathcal{C}'$ 
between two good categories, not necessarily involving the same ring, with the 
property of being additive. This means that $F(\varphi_1 + \varphi_2) = F(\varphi_1) + F(\varphi_2)$ when 
$\varphi_1$ and $\varphi_2$ are in the same $\operatorname{Hom}_R(A, B)$. 

An additive functor sends any 0 map to the corresponding 0 map. Consequently 
it always sends complexes to complexes. Moreover, since any functor carries the 
identity map of each $\operatorname{Hom}_R(A, A)$ to an identity map, an additive functor has to 
send any module A for which the 0 map and the identity coincide to another such 
module. The 0 module is the unique module A with this property, and thus an 
additive functor has to send the 0 module to a 0 module. 

Moreover, additive functors carry finite direct sums to finite direct sums. (Re- 
call that good categories are closed under finite direct sums.) This fact needs 
proper formulation, and we need first to express direct sums in terms of modules 
and maps. From the point of view of category theory, we shall take advantage 
of the fact that for left R modules, product and coproduct coincide and are given 
by direct sum. If $C = A \oplus B$, then there are thus projections $p_A : C \to A$ and 
$p_B : C \to B$ and injections $i_A : A \to C$ and $i_B : B \to C$ such that 

$$p_A i_A = 1_A \quad\text{and}\quad p_B i_B = 1_B,$$

$$p_B i_A = 0 \quad\text{and}\quad p_A i_B = 0,$$

and 

$$i_A p_A + i_B p_B = 1_C.$$

Conversely if we have maps $p_A$, $i_A$, $p_B$, and $i_B$ with these properties, then the 
modules $A = \operatorname{image} p_A$ and $B = \operatorname{image} p_B$ have the property that C is the internal 
direct sum $C = i_A A \oplus i_B B$, and $i_A$ and $i_B$ are one-one. In fact, the equation 
$i_A p_A + i_B p_B = 1_C$ shows that $i_A A + i_B B = C$. To see that $i_A A \cap i_B B = 0$, 
let x be in the intersection. Then $p_B x$ lies in $p_B i_A A$, which is 0, and $p_A x$ lies in 
$p_A i_B B$, which is 0. Thus $i_A p_A + i_B p_B = 1_C$ gives $0 = i_A p_A x + i_B p_B x = x$. 
Hence $i_A A \cap i_B B = 0$ and $C = i_A A \oplus i_B B$. Finally the equations $p_A i_A = 1_A$ 
and $p_B i_B = 1_B$ imply that $i_A$ and $i_B$ are one-one. 

With direct sum now expressed in terms of modules and maps, let us return to 
the effect of additive functors on direct sums. Let $C = A \oplus B$, and let $p_A$, $p_B$, $i_A$, 
and $i_B$ be as above. Suppose that the additive functor F is covariant. Applying F 
to the displayed identities in the previous paragraph and using that F is additive, 
we see that $F(p_A)$, $F(p_B)$, $F(i_A)$, and $F(i_B)$ have the properties that allow us to 
recognize a direct sum. Hence 

$$F(C) = F(i_A)F(A) \oplus F(i_B)F(B)$$

with $F(i_A)$ and $F(i_B)$ one-one. Thus 

$$F(C) \cong F(A) \oplus F(B).$$

If instead F is contravariant, then the roles of the projections and the injections 
get interchanged, but we still obtain $F(C) \cong F(A) \oplus F(B)$. 

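An additive functor $F : \mathcal{C} \to \mathcal{C}'$ between two good categories is exact if it
transforms exact sequences into exact sequences. Proposition 4.3 below will show
that exact covariant functors preserve kernels, images, cokernels, submodules,
quotients, and more. However, exact functors occur only infrequently; we shall
see a few examples of them in Section 4. For examples of failures at exactness,
it was shown in Section X.6 of Basic Algebra that if

\[ 0 \longrightarrow M \xrightarrow{\ \varphi\ } N \xrightarrow{\ \psi\ } P \longrightarrow 0 \]

is a short exact sequence in the category $\mathcal{C}_R$, if $E$ is a unital left $R$ module, and if
$E'$ is a unital right $R$ module, then the following sequences in $\mathcal{C}_{\mathbb{Z}}$ are exact:

\[ E' \otimes_R M \xrightarrow{\ 1\otimes\varphi\ } E' \otimes_R N \xrightarrow{\ 1\otimes\psi\ } E' \otimes_R P \longrightarrow 0, \]

\[ 0 \longrightarrow \operatorname{Hom}_R(E, M) \xrightarrow{\ \operatorname{Hom}(1,\varphi)\ } \operatorname{Hom}_R(E, N) \xrightarrow{\ \operatorname{Hom}(1,\psi)\ } \operatorname{Hom}_R(E, P), \]

\[ \operatorname{Hom}_R(M, E) \xleftarrow{\ \operatorname{Hom}(\varphi,1)\ } \operatorname{Hom}_R(N, E) \xleftarrow{\ \operatorname{Hom}(\psi,1)\ } \operatorname{Hom}_R(P, E) \longleftarrow 0; \]

on the other hand, the extensions of these complexes to 5-term complexes by the
adjoining of a 0 need not be exact, and thus the functors $E' \otimes_R (\,\cdot\,)$, $\operatorname{Hom}_R(E, \,\cdot\,)$,
and $\operatorname{Hom}_R(\,\cdot\,, E)$ are not exact for suitable choices of $R$, $E$, and $E'$.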
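
To make the failure of exactness concrete, here is a small worked instance. The choices $R = \mathbb{Z}$, $E = E' = \mathbb{Z}/2\mathbb{Z}$, and the sequence given by multiplication by 2 are assumptions made only for illustration; they are not taken from the discussion above.

```latex
% Illustrative choices (assumptions): R = Z, M = N = Z, P = Z/2Z,
% \varphi = multiplication by 2, \psi = reduction modulo 2.
\[
0 \longrightarrow \mathbb{Z} \xrightarrow{\ \varphi = 2\ } \mathbb{Z}
  \xrightarrow{\ \psi\ } \mathbb{Z}/2\mathbb{Z} \longrightarrow 0 .
\]
% Tensor with E' = Z/2Z. Each term becomes Z/2Z, and 1 \otimes \varphi is
% multiplication by 2 on Z/2Z, i.e. the zero map.
\[
\mathbb{Z}/2\mathbb{Z} \xrightarrow{\ 1\otimes\varphi = 0\ } \mathbb{Z}/2\mathbb{Z}
  \xrightarrow{\ 1\otimes\psi\ } \mathbb{Z}/2\mathbb{Z} \longrightarrow 0 .
\]
% This is exact as it stands (1 \otimes \psi is an isomorphism), but
% 1 \otimes \varphi is not one-one, so a 0 cannot be adjoined at the left.
%
% Apply Hom_Z(E, . ) with E = Z/2Z. Since Z has no nonzero element of
% finite order, Hom_Z(Z/2Z, Z) = 0.
\[
0 \longrightarrow 0 \longrightarrow 0 \longrightarrow
  \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/2\mathbb{Z}, \mathbb{Z}/2\mathbb{Z}) \cong \mathbb{Z}/2\mathbb{Z} .
\]
% This is exact as it stands, but Hom(1,\psi) is not onto, so a 0 cannot
% be adjoined at the right.
```

In both cases the three-term exactness asserted above holds, and it is exactly the fifth term that fails.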


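Proposition 4.3. An additive functor $F : \mathcal{C} \to \mathcal{C}'$ between two good categories
is exact if and only if it carries all short exact sequences into short exact sequences.


REMARK. This proposition makes it a little easier to test concrete additive
functors for exactness than it would be from the definition.


PROOF. Necessity is obvious. For sufficiency, let

\[ A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \]

be exact, and let the additive functor $F$ be covariant, the contravariant case being
completely analogous. Put $A_1 = \ker\varphi$, $B_1 = \ker\psi$, and $C_1 = \operatorname{image}\psi$. Since
$\psi\varphi = 0$, we can factor $\varphi$ as $\varphi = \varphi_2\varphi_1$, where $\varphi_1 : A \to B_1$ is $\varphi$ with its range
space reduced and where $\varphi_2 : B_1 \to B$ is the inclusion. Similarly we can factor
$\psi$ as $\psi = \psi_2\psi_1$, where $\psi_1 : B \to C_1$ is $\psi$ with its range space reduced and
where $\psi_2 : C_1 \to C$ is the inclusion. Of the sequences

\[ 0 \longrightarrow A_1 \longrightarrow A \xrightarrow{\ \varphi_1\ } B_1 \longrightarrow 0, \]

\[ 0 \longrightarrow B_1 \xrightarrow{\ \varphi_2\ } B \xrightarrow{\ \psi_1\ } C_1 \longrightarrow 0, \]

\[ 0 \longrightarrow C_1 \xrightarrow{\ \psi_2\ } C \longrightarrow C/C_1 \longrightarrow 0, \]

the first and the third are trivially exact, and the second is exact because $\ker\psi_1 =
\ker\psi = \operatorname{image}\varphi = \operatorname{image}\varphi_2$. The hypothesis that $F$ carries short exact
sequences to short exact sequences thus implies that the three sequences

\[ 0 \longrightarrow F(A_1) \longrightarrow F(A) \xrightarrow{\ F(\varphi_1)\ } F(B_1) \longrightarrow 0, \]

\[ 0 \longrightarrow F(B_1) \xrightarrow{\ F(\varphi_2)\ } F(B) \xrightarrow{\ F(\psi_1)\ } F(C_1) \longrightarrow 0, \]

\[ 0 \longrightarrow F(C_1) \xrightarrow{\ F(\psi_2)\ } F(C) \longrightarrow F(C/C_1) \longrightarrow 0 \]

are exact. From these, $\ker F(\psi_1) = \operatorname{image} F(\varphi_2)$. Also, $F(\psi_2)$ is one-one, so
that

\[ \ker F(\psi_1) = \ker\big(F(\psi_2)F(\psi_1)\big) = \ker F(\psi), \]

and $F(\varphi_1)$ is onto, so that

\[ \operatorname{image} F(\varphi_2) = \operatorname{image}\big(F(\varphi_2)F(\varphi_1)\big) = \operatorname{image} F(\varphi). \]

Hence $\ker F(\psi) = \operatorname{image} F(\varphi)$, and

\[ F(A) \xrightarrow{\ F(\varphi)\ } F(B) \xrightarrow{\ F(\psi)\ } F(C) \]

is exact, as required.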


Proposition 4.4. Let $F : \mathcal{C} \to \mathcal{C}'$ be an additive functor between good
categories, let $X$ be a complex in $\mathcal{C}$, and let $F(X)$ be the corresponding complex
in $\mathcal{C}'$. If $F$ is exact, then $F$ carries the homology or cohomology of $X$ to the
homology or cohomology of $F(X)$.


REMARKS. Our convention is to refer to homology when the indexing goes 
down and cohomology when the indexing goes up. If F is covariant, it preserves 
the indexing, while if F is contravariant, it reverses it. For the proof we shall use 
notation A, B,C for modules that is neutral with respect to the indexing. The 
arguments are qualitatively different in the covariant and contravariant cases, and 
we shall give both of them. 


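PROOF IN THE COVARIANT CASE. Let

\[ A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \]

be a given complex, thus having $\psi\varphi = 0$, and form the image complex

\[ F(A) \xrightarrow{\ F(\varphi)\ } F(B) \xrightarrow{\ F(\psi)\ } F(C). \]

We are to exhibit an isomorphism

\[ F(\ker\psi/\operatorname{image}\varphi) \cong \ker F(\psi)/\operatorname{image} F(\varphi). \tag{$*$} \]

Let $i : \operatorname{image}\varphi \to \ker\psi$ and $j : \ker\psi \to B$ be the inclusions, and let
$q : \ker\psi \to \ker\psi/\operatorname{image}\varphi$ be the quotient map. Applying $F$ to the exact
sequence

\[ 0 \longrightarrow \operatorname{image}\varphi \xrightarrow{\ i\ } \ker\psi \xrightarrow{\ q\ } \ker\psi/\operatorname{image}\varphi \longrightarrow 0 \]

and using exactness, we obtain an isomorphism via $F(q)$:

\[ F(\ker\psi/\operatorname{image}\varphi) \cong F(\ker\psi)/F(i)F(\operatorname{image}\varphi). \tag{$**$} \]

Since $j$ is one-one and $F$ is exact, $F(j)$ is one-one. Thus application of $F(j)$ to
the right side of $(**)$ gives

\[ F(\ker\psi/\operatorname{image}\varphi) \cong F(j)F(\ker\psi)/F(ji)F(\operatorname{image}\varphi). \tag{$\dagger$} \]

If $\bar\varphi$ denotes $\varphi$ with its range reduced to its image, then $\varphi = ji\bar\varphi$. Applying $F$ to
the two exact sequences

\[ \ker\psi \xrightarrow{\ j\ } B \xrightarrow{\ \psi\ } C, \]

\[ A \xrightarrow{\ \bar\varphi\ } \operatorname{image}\varphi \longrightarrow 0 \]

gives us $F(j)F(\ker\psi) = \ker F(\psi)$ and $F(\operatorname{image}\varphi) = F(\bar\varphi)F(A)$. Applying
$F(ji)$ to the second of these and substituting both into the right side of $(\dagger)$
transforms $(\dagger)$ into $(*)$ and gives the required isomorphism.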


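PROOF IN THE CONTRAVARIANT CASE. Let

\[ A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \]

be given with $\psi\varphi = 0$, and form the image complex

\[ F(A) \xleftarrow{\ F(\varphi)\ } F(B) \xleftarrow{\ F(\psi)\ } F(C). \]

We are to exhibit an isomorphism

\[ F(\ker\psi/\operatorname{image}\varphi) \cong \ker F(\varphi)/\operatorname{image} F(\psi). \tag{$*$} \]

Let $j : \ker\psi \to B$ be the inclusion, let $\bar{j} : \ker\psi/\operatorname{image}\varphi \to B/\operatorname{image}\varphi$ be
the induced map between quotients, and let $q$, $q'$, $q''$ be the quotient maps

\[ q : B \to B/\ker\psi, \qquad q' : B \to B/\operatorname{image}\varphi, \qquad q'' : B/\operatorname{image}\varphi \to B/\ker\psi. \]

These satisfy $q = q''q'$. Applying $F$ to the exact sequence

\[ 0 \longrightarrow \ker\psi/\operatorname{image}\varphi \xrightarrow{\ \bar{j}\ } B/\operatorname{image}\varphi \xrightarrow{\ q''\ } B/\ker\psi \longrightarrow 0 \]

and using exactness, we obtain an isomorphism via $F(\bar{j})$:

\[ F(\ker\psi/\operatorname{image}\varphi) \cong F(B/\operatorname{image}\varphi)/F(q'')F(B/\ker\psi). \tag{$**$} \]

Since $q'$ is onto and $F$ is exact, $F(q')$ is one-one. Thus application of $F(q')$ to
the right side of $(**)$ gives

\[ F(\ker\psi/\operatorname{image}\varphi) \cong F(q')F(B/\operatorname{image}\varphi)/F(q)F(B/\ker\psi). \tag{$\dagger$} \]

Applying $F$ to the three exact sequences

\[ A \xrightarrow{\ \varphi\ } B \xrightarrow{\ q'\ } B/\operatorname{image}\varphi, \]

\[ \ker\psi \xrightarrow{\ j\ } B \xrightarrow{\ \psi\ } C, \]

\[ \ker\psi \xrightarrow{\ j\ } B \xrightarrow{\ q\ } B/\ker\psi \]

gives us $F(q')F(B/\operatorname{image}\varphi) = \ker F(\varphi)$ and $F(q)F(B/\ker\psi) = \ker F(j) =
\operatorname{image} F(\psi)$. Substituting both these equalities into the right side of $(\dagger)$
transforms $(\dagger)$ into $(*)$ and gives the required isomorphism.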


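We were reminded before Proposition 4.3 that $\operatorname{Hom}_R$ and $\otimes_R$ need not yield
exact functors. The partial exactness that they exhibit, as opposed to exactness
itself, is more typical of additive functors, and we incorporate this behavior into
two definitions. We shall define left and right exactness in such a way that $\operatorname{Hom}_R$
is left exact in each variable and $\otimes_R$ is right exact. An additive functor $F$ is left
exact if the exactness of

\[ 0 \longrightarrow A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \longrightarrow 0 \]

implies the exactness of

\[ 0 \longrightarrow F(A) \xrightarrow{\ F(\varphi)\ } F(B) \xrightarrow{\ F(\psi)\ } F(C) \qquad (F \text{ covariant}), \]

\[ 0 \longrightarrow F(C) \xrightarrow{\ F(\psi)\ } F(B) \xrightarrow{\ F(\varphi)\ } F(A) \qquad (F \text{ contravariant}). \]

We say that $F$ is right exact if the exactness of the sequence with $0, A, B, C, 0$
above implies the exactness of

\[ F(A) \xrightarrow{\ F(\varphi)\ } F(B) \xrightarrow{\ F(\psi)\ } F(C) \longrightarrow 0 \qquad (F \text{ covariant}), \]

\[ F(C) \xrightarrow{\ F(\psi)\ } F(B) \xrightarrow{\ F(\varphi)\ } F(A) \longrightarrow 0 \qquad (F \text{ contravariant}). \]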


The words “left” and “right” refer to the part of the target sequence that is exact 
when the arrows are arranged to point to the right. A consequence (but not the 
full content) of these definitions in each case is an assertion about one-one or 
onto maps. For example a left exact covariant $F$ carries one-one maps to one-one
maps; we have only to start from a one-one map $\varphi : A \to B$ and set up a
short exact sequence with $C = B/\operatorname{image}\varphi$, and the definition shows that $F(\varphi)$
is one-one.


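Proposition 4.5. If $F$ is a covariant left exact functor, then $F$ carries an exact
sequence

\[ 0 \longrightarrow A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \]

into an exact sequence

\[ 0 \longrightarrow F(A) \xrightarrow{\ F(\varphi)\ } F(B) \xrightarrow{\ F(\psi)\ } F(C). \]

REMARK. The expected analogs of this result are valid if $F$ is contravariant or
if $F$ is right exact or both.


PROOF. Starting from the given exact sequence, let $i : \operatorname{image}\psi \to C$ be the
inclusion, and let $\bar\psi : B \to \operatorname{image}\psi$ be $\psi$ with its range space reduced. Then
$\psi = i\bar\psi$, and the sequences

\[ 0 \longrightarrow A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \bar\psi\ } \operatorname{image}\psi \longrightarrow 0 \]

and

\[ 0 \longrightarrow \operatorname{image}\psi \xrightarrow{\ i\ } C \longrightarrow C/\operatorname{image}\psi \longrightarrow 0 \]

are exact. Applying $F$ and using its left exactness, we see that

\[ 0 \longrightarrow F(A) \xrightarrow{\ F(\varphi)\ } F(B) \xrightarrow{\ F(\bar\psi)\ } F(\operatorname{image}\psi) \]

and

\[ 0 \longrightarrow F(\operatorname{image}\psi) \xrightarrow{\ F(i)\ } F(C) \]

are exact. Thus $F(i)$ is one-one, and $F(\psi) = F(i\bar\psi) = F(i)F(\bar\psi)$ has the
same kernel as $F(\bar\psi)$. The exactness of the first image complex shows that
$\ker F(\bar\psi) = \operatorname{image} F(\varphi)$, and the proof of the required exactness is complete.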
3. Long Exact Sequences


As in Section 2, let C be a good category. We have seen that chain complexes 
in C themselves form a category whose morphisms are chain maps. If we have 
several chain maps in succession, each with an index n € Z, we can say that 
they form an “exact sequence” of chain maps if for each n, the sequences of 
modules and maps having index n form an exact sequence in C. Our objective 
in this section is to show that any short exact sequence of complexes of this kind 
yields a “long exact sequence” of modules and maps in C involving all the indices. 
More precisely we are able to construct for each n a “connecting homomorphism”
relating² what happens with each index n to what happens for index n + 1 or n − 1
and incorporating modules and maps for all indices into a single exact sequence
of infinite length.


By way of preparation for the construction of connecting homomorphisms, let 
us be more explicit about the discussion in Section 2 of how a chain map carries 
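the homology of one complex to the homology of another complex. Let

\[
\begin{array}{ccc}
A & \xrightarrow{\ \varphi\ } & B \\
\big\downarrow{\scriptstyle\alpha} & & \big\downarrow{\scriptstyle\beta} \\
A' & \xrightarrow{\ \varphi'\ } & B'
\end{array}
\]

be a commutative diagram in the good category $\mathcal{C}$. Let us observe that $\varphi(\ker\alpha) \subseteq
\ker\beta$; in fact, any $a \in \ker\alpha$ has $0 = \varphi'\alpha(a) = \beta\varphi(a)$, and thus $\varphi(a)$ is in $\ker\beta$.
Let us observe further that $\varphi'(\alpha(A)) = \beta(\varphi(A)) \subseteq \beta(B)$; since $\varphi'$ carries $A'$ into
$B'$, it follows that $\varphi'$ descends to a mapping $\bar\varphi'$ defined on $A'/\alpha(A) = \operatorname{coker}\alpha$
and taking values in $B'/\beta(B) = \operatorname{coker}\beta$. We can summarize these remarks by
the inclusions

\[ \varphi(\ker\alpha) \subseteq \ker\beta \quad\text{and}\quad \bar\varphi'(\operatorname{coker}\alpha) \subseteq \operatorname{coker}\beta. \]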


Using these remarks, we can now construct a “connecting homomorphism” when- 
ever we have a diagram as in Figure 4.1 below. 


²For readers familiar with the use of homology in topology, connecting homomorphisms arise
when one works with the homology of a topological space, the homology of a subspace, and the 
relative homology of the space and the subspace; the construction in this section may be regarded 
as an abstract version of that construction. 
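\[
\begin{array}{ccccccccc}
 & & A & \xrightarrow{\ \varphi\ } & B & \xrightarrow{\ \psi\ } & C & \longrightarrow & 0 \\
 & & \big\downarrow{\scriptstyle\alpha} & & \big\downarrow{\scriptstyle\beta} & & \big\downarrow{\scriptstyle\gamma} & & \\
0 & \longrightarrow & A' & \xrightarrow{\ \varphi'\ } & B' & \xrightarrow{\ \psi'\ } & C' & &
\end{array}
\]


FIGURE 4.1. Snake diagram. The rows are assumed exact, and the squares
commute. In this situation the Snake Lemma constructs
a connecting homomorphism $\omega : \ker\gamma \to \operatorname{coker}\alpha$.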


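Lemma 4.6 (Snake Lemma). In a good category $\mathcal{C}$, a snake diagram as in
Figure 4.1 induces a homomorphism $\omega : \ker\gamma \to \operatorname{coker}\alpha$ with

\[ \ker\omega = \psi(\ker\beta) \quad\text{and}\quad \operatorname{image}\omega = \varphi'^{-1}(\operatorname{image}\beta)/\operatorname{image}\alpha, \]

and with $\omega(c) = \varphi'^{-1}(\beta(\psi^{-1}(c))) + \operatorname{image}\alpha$ for $c \in \ker\gamma$, and then

\[ \ker\alpha \xrightarrow{\ \bar\varphi\ } \ker\beta \xrightarrow{\ \bar\psi\ } \ker\gamma \xrightarrow{\ \omega\ } \operatorname{coker}\alpha \xrightarrow{\ \bar\varphi'\ } \operatorname{coker}\beta \xrightarrow{\ \bar\psi'\ } \operatorname{coker}\gamma \]

is an exact sequence. Here $\bar\varphi$ and $\bar\psi$ are restrictions of $\varphi$ and $\psi$, and $\bar\varphi'$ and $\bar\psi'$
are descended versions of $\varphi'$ and $\psi'$. If $\varphi$ is one-one, then $\bar\varphi$ is one-one. If $\psi'$ is
onto $C'$, then $\bar\psi'$ is onto $\operatorname{coker}\gamma$.


REMARKS. The homomorphism $\omega$ is called a connecting homomorphism.
The name “Snake Lemma” comes from the pattern that the six-term exact sequence
makes when superimposed on the enlarged version of Figure 4.1 shown
in Figure 4.2.

\[
\begin{array}{ccccccccc}
 & & \ker\alpha & \longrightarrow & \ker\beta & \longrightarrow & \ker\gamma & & \\
 & & \big\downarrow & & \big\downarrow & & \big\downarrow & & \\
 & & A & \longrightarrow & B & \longrightarrow & C & \longrightarrow & 0 \\
 & & \big\downarrow{\scriptstyle\alpha} & & \big\downarrow{\scriptstyle\beta} & & \big\downarrow{\scriptstyle\gamma} & & \\
0 & \longrightarrow & A' & \longrightarrow & B' & \longrightarrow & C' & & \\
 & & \big\downarrow & & \big\downarrow & & \big\downarrow & & \\
 & & \operatorname{coker}\alpha & \longrightarrow & \operatorname{coker}\beta & \longrightarrow & \operatorname{coker}\gamma & &
\end{array}
\]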


FIGURE 4.2. Enlarged snake diagram. 
PROOF. First let us construct $\omega$ and see that it is well defined. Let $c$ be in
$\ker\gamma$. Since $\psi$ is onto $C$, write $c = \psi(b)$ for some $b \in B$. The commutativity of
the second square in Figure 4.1 gives $0 = \gamma(c) = \gamma\psi(b) = \psi'\beta(b)$. Thus $\beta(b)$
is in $\ker\psi' = \operatorname{image}\varphi'$, and $\beta(b) = \varphi'(a')$ for some $a' \in A'$; the element $a'$ is
uniquely determined, since $\varphi'$ is one-one. Define $\omega(c) = a' + \alpha(A)$.

The only choice in this definition is that of $b$, and we are to show that any
other choice leads to the same member of $\operatorname{coker}\alpha$. If $\bar b$ is another choice and if
$\beta(\bar b) = \varphi'(\bar a')$ with $\bar a' \in A'$, then $\psi(\bar b - b) = c - c = 0$ shows that $\bar b - b = \varphi(a)$
for some $a \in A$. Thus $\varphi'(\bar a' - a') = \beta(\bar b - b) = \beta\varphi(a) = \varphi'(\alpha(a))$. Since $\varphi'$ is
one-one, $\bar a' - a' = \alpha(a)$, and $\bar a'$ and $a'$ are exhibited as in the same coset of $A'$
modulo $\alpha(A)$.

Let us compute $\ker\omega$. Suppose that $\omega(c) = 0$, i.e., that $\omega(c)$ is the trivial coset $\alpha(A)$.
By construction of $\omega$, $\omega(c) = a' + \alpha(A)$ for an element
$a' \in A'$ such that $\beta(b) = \varphi'(a')$ and $c = \psi(b)$. In this case, $a' = \alpha(a)$ for some $a \in A$.
So $\beta(b) = \varphi'\alpha(a) = \beta\varphi(a)$, and thus $b - \varphi(a)$ is in $\ker\beta$. Consequently
$c = \psi(b) = \psi(b) - \psi\varphi(a) = \psi(b - \varphi(a))$ is in $\psi(\ker\beta)$, and $\ker\omega \subseteq \psi(\ker\beta)$. For the
reverse inclusion, if $c$ is in $\psi(\ker\beta)$, choose $b \in \ker\beta$ with $\psi(b) = c$. Then
$\gamma(c) = \gamma\psi(b) = \psi'\beta(b) = 0$ shows that $\omega(c)$ is defined. Since $c = \psi(b)$, the
construction of $\omega$ shows that $\beta(b) = \varphi'(a')$ for some $a' \in A'$. Since $b$ is in $\ker\beta$
and since $\varphi'$ is one-one, this $a'$ must be 0. Then $\omega(c) = a' + \alpha(A) = 0 + \alpha(A)$,
$c$ is in $\ker\omega$, and $\psi(\ker\beta) \subseteq \ker\omega$.

Now we compute $\operatorname{image}\omega$. Our step-by-step definition of $\omega$ shows that
$\operatorname{image}\omega \subseteq \varphi'^{-1}(\operatorname{image}\beta)/\alpha(A)$. For the reverse inclusion, suppose that $a' \in A'$
is in $\varphi'^{-1}(\operatorname{image}\beta)$, i.e., has $\varphi'(a') = \beta(b)$ for some $b \in B$. Then the element
$c = \psi(b)$ of $C$ has $\gamma(c) = \gamma\psi(b) = \psi'\beta(b) = \psi'\varphi'(a') = 0$, and $\omega(c)$
is therefore defined. Our definition of $\omega$ makes $\omega(c) = a' + \alpha(A)$, and thus
$\varphi'^{-1}(\operatorname{image}\beta)/\alpha(A) \subseteq \operatorname{image}\omega$.

We are left with establishing the exactness of the displayed sequence of six
terms at the four positions other than the ends and with proving the two assertions
in the last sentence of the lemma.

The condition of exactness at $\ker\beta$ is that $\varphi(\ker\alpha) = \ker\psi \cap \ker\beta$. The
inclusion $\subseteq$ follows from the equalities $0 = \psi\varphi$ and $\beta\varphi(\ker\alpha) = \varphi'\alpha(\ker\alpha) = 0$.
For the inclusion $\supseteq$, let $b \in B$ satisfy $\psi(b) = \beta(b) = 0$. Exactness at $B$ gives
$b = \varphi(a)$ with $a \in A$. Then $0 = \beta(b) = \beta\varphi(a) = \varphi'\alpha(a)$ with $\varphi'$ one-one
implies that $\alpha(a) = 0$, and $a$ is in $\ker\alpha$. Thus $b$ is in $\varphi(\ker\alpha)$, and exactness at
$\ker\beta$ is proved. If $\varphi$ is one-one, then certainly its restriction $\bar\varphi$ is one-one.

The condition of exactness at $\ker\gamma$ is that $\ker\omega = \psi(\ker\beta)$, and this was
proved in the third paragraph of the proof.

By the result of the fourth paragraph, the condition of exactness at $\operatorname{coker}\alpha$
is that $\varphi'^{-1}(\beta(B))/\alpha(A)$ equal $\ker\bar\varphi'$, where $\bar\varphi' : A'/\alpha(A) \to B'/\beta(B)$ is the
map induced by $\varphi'$. The members of $\ker\bar\varphi'$ are those cosets $a' + \alpha(A)$ with
$\varphi'(a' + \alpha(A)) \subseteq \beta(B)$. Since $\varphi'\alpha(A) = \beta\varphi(A) \subseteq \beta(B)$, the condition on
$a' + \alpha(A)$ is that $\varphi'(a')$ be in $\beta(B)$, hence that $a'$ be in $\varphi'^{-1}(\beta(B))$, hence that
the coset $a' + \alpha(A)$ be in $\varphi'^{-1}(\beta(B))/\alpha(A)$. Thus we have exactness at $\operatorname{coker}\alpha$.

At $\operatorname{coker}\beta$, we know that the descended map $\bar\varphi'$ maps $\operatorname{coker}\alpha$ into $\operatorname{coker}\beta$, and
we are to show that $\bar\varphi'(\operatorname{coker}\alpha) = \ker\bar\psi'$. Inclusion $\subseteq$ follows because $\psi'\varphi' = 0$
implies $\bar\psi'\bar\varphi'(a' + \alpha(A)) = \bar\psi'(\varphi'(a') + \beta(B)) = \psi'\varphi'(a') + \gamma(C) = \gamma(C)$. For
the reverse inclusion let $b' \in B'$ have $\bar\psi'(b' + \beta(B)) = \gamma(C)$. Then $\psi'(b')$ is in
$\gamma(C)$. Since $\psi : B \to C$ is onto, we can find $b \in B$ with $\psi'(b') = \gamma\psi(b) =
\psi'\beta(b)$. Hence $b' - \beta(b)$ is in $\ker\psi' = \operatorname{image}\varphi'$, and $b' - \beta(b) = \varphi'(a')$ for some
$a' \in A'$. Consequently $b' + \beta(B) = \varphi'(a') + \beta(b) + \beta(B) = \varphi'(a') + \beta(B) =
\bar\varphi'(a' + \alpha(A))$, and $b' + \beta(B)$ is exhibited as in $\bar\varphi'(\operatorname{coker}\alpha)$.
Thus we have exactness at $\operatorname{coker}\beta$. Finally if $\psi'$ is onto $C'$, then
certainly its descended map $\bar\psi'$ is onto $\operatorname{coker}\gamma$. This completes the proof.
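
The construction of the connecting homomorphism can be traced by hand in a small case. The following sketch assumes a particular snake diagram of abelian groups, chosen only for illustration and not taken from the text: both rows are the sequence $\mathbb{Z} \to \mathbb{Z} \to \mathbb{Z}/2\mathbb{Z}$ given by multiplication by 2 followed by reduction modulo 2, and the vertical maps are multiplication by 2.

```latex
% Assumed data (illustrative): A = B = A' = B' = Z, C = C' = Z/2Z,
% \varphi = \varphi' = multiplication by 2, \psi = \psi' = reduction mod 2,
% \alpha = \beta = multiplication by 2, \gamma = 0.
% Squares commute: \varphi'\alpha(a) = 4a = \beta\varphi(a), and
% \psi'\beta(b) = 2b mod 2 = 0 = \gamma\psi(b).
\[
\ker\alpha = 0, \quad \ker\beta = 0, \quad \ker\gamma = \mathbb{Z}/2\mathbb{Z}, \qquad
\operatorname{coker}\alpha \cong \operatorname{coker}\beta \cong \operatorname{coker}\gamma \cong \mathbb{Z}/2\mathbb{Z}.
\]
% Compute \omega on the nonzero element c = 1 + 2Z of ker \gamma:
%   lift:        c = \psi(b) with b = 1,
%   push down:   \beta(b) = 2,
%   pull back:   2 = \varphi'(a') with a' = 1,
% so \omega(c) = 1 + \alpha(A) = 1 + 2Z, and \omega is an isomorphism.
\[
\omega(1 + 2\mathbb{Z}) = 1 + 2\mathbb{Z}, \qquad
\bar\varphi'(a' + 2\mathbb{Z}) = 2a' + 2\mathbb{Z} = 0, \qquad
\bar\psi'(b' + 2\mathbb{Z}) = b' + 2\mathbb{Z}.
\]
% Six-term sequence of the lemma:
\[
0 \longrightarrow 0 \longrightarrow \mathbb{Z}/2\mathbb{Z}
  \xrightarrow{\ \omega,\ \cong\ } \mathbb{Z}/2\mathbb{Z}
  \xrightarrow{\ \bar\varphi' = 0\ } \mathbb{Z}/2\mathbb{Z}
  \xrightarrow{\ \bar\psi',\ \cong\ } \mathbb{Z}/2\mathbb{Z}.
\]
% Checks: ker \omega = 0 = \psi(ker \beta); image \omega = Z/2Z = ker \bar\varphi';
% image \bar\varphi' = 0 = ker \bar\psi'.
```

The value $\operatorname{image}\omega = \varphi'^{-1}(\operatorname{image}\beta)/\operatorname{image}\alpha = \varphi'^{-1}(2\mathbb{Z})/2\mathbb{Z} = \mathbb{Z}/2\mathbb{Z}$ agrees with the formula in the statement of the lemma.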


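Theorem 4.7. Let $A = \{(A_n, \alpha_n)\}$, $B = \{(B_n, \beta_n)\}$, and $C = \{(C_n, \gamma_n)\}$ be
chain complexes in a good category $\mathcal{C}$, and suppose that $\varphi = \{\varphi_n\} : A \to B$ and
$\psi = \{\psi_n\} : B \to C$ are chain maps such that the sequence

\[ 0 \longrightarrow A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \longrightarrow 0 \]

of chain complexes is exact. Then this exact sequence of chain complexes induces
an exact sequence in homology of the form

\[ \cdots \longrightarrow H_{n+1}(C) \xrightarrow{\ \bar\omega_n\ } H_n(A) \xrightarrow{\ \bar\varphi_n\ } H_n(B) \xrightarrow{\ \bar\psi_n\ } H_n(C) \xrightarrow{\ \bar\omega_{n-1}\ } H_{n-1}(A) \longrightarrow \cdots. \]

Here the map $\bar\omega_n : H_{n+1}(C) \to H_n(A)$ has descended from the connecting
homomorphism $\omega_n$ defined on $\ker\gamma_n$ in $C_{n+1}$ and having range $\operatorname{coker}\alpha_n =
A_n/\operatorname{image}\alpha_n$.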

REMARKS. 

(1) The exact sequence in homology is called the long exact sequence in
homology corresponding to the short exact sequence of chain complexes, and the
maps $\bar\omega_n$ are called connecting homomorphisms. As the proof will show, these
connecting homomorphisms arise by two applications of the Snake Lemma, not
just one.


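(2) In more detail the diagram of the short exact sequence of chain complexes
is of the form

\[
\begin{array}{ccccccccc}
 & & \big\downarrow & & \big\downarrow & & \big\downarrow & & \\
0 & \longrightarrow & A_{n+1} & \xrightarrow{\ \varphi_{n+1}\ } & B_{n+1} & \xrightarrow{\ \psi_{n+1}\ } & C_{n+1} & \longrightarrow & 0 \\
 & & \big\downarrow{\scriptstyle\alpha_n} & & \big\downarrow{\scriptstyle\beta_n} & & \big\downarrow{\scriptstyle\gamma_n} & & \\
0 & \longrightarrow & A_n & \xrightarrow{\ \varphi_n\ } & B_n & \xrightarrow{\ \psi_n\ } & C_n & \longrightarrow & 0 \\
 & & \big\downarrow{\scriptstyle\alpha_{n-1}} & & \big\downarrow{\scriptstyle\beta_{n-1}} & & \big\downarrow{\scriptstyle\gamma_{n-1}} & & \\
0 & \longrightarrow & A_{n-1} & \xrightarrow{\ \varphi_{n-1}\ } & B_{n-1} & \xrightarrow{\ \psi_{n-1}\ } & C_{n-1} & \longrightarrow & 0 \\
 & & \big\downarrow & & \big\downarrow & & \big\downarrow & &
\end{array}
\]

The rows are exact, the columns are chain complexes, and the squares commute.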


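(3) The corresponding result for cochain complexes involves the diagram

\[
\begin{array}{ccccccccc}
 & & \big\uparrow & & \big\uparrow & & \big\uparrow & & \\
0 & \longrightarrow & A_{n+1} & \xrightarrow{\ \varphi_{n+1}\ } & B_{n+1} & \xrightarrow{\ \psi_{n+1}\ } & C_{n+1} & \longrightarrow & 0 \\
 & & \big\uparrow{\scriptstyle\alpha_n} & & \big\uparrow{\scriptstyle\beta_n} & & \big\uparrow{\scriptstyle\gamma_n} & & \\
0 & \longrightarrow & A_n & \xrightarrow{\ \varphi_n\ } & B_n & \xrightarrow{\ \psi_n\ } & C_n & \longrightarrow & 0 \\
 & & \big\uparrow{\scriptstyle\alpha_{n-1}} & & \big\uparrow{\scriptstyle\beta_{n-1}} & & \big\uparrow{\scriptstyle\gamma_{n-1}} & & \\
0 & \longrightarrow & A_{n-1} & \xrightarrow{\ \varphi_{n-1}\ } & B_{n-1} & \xrightarrow{\ \psi_{n-1}\ } & C_{n-1} & \longrightarrow & 0 \\
 & & \big\uparrow & & \big\uparrow & & \big\uparrow & &
\end{array}
\]

and the corresponding long exact sequence in cohomology is

\[ \cdots \longrightarrow H^{n-1}(C) \longrightarrow H^n(A) \longrightarrow H^n(B) \longrightarrow H^n(C) \longrightarrow H^{n+1}(A) \longrightarrow \cdots. \]

The result for cochain complexes is a consequence of the result for chain complexes
and follows by making adjustments in the notation.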
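PROOF. We regard the top two displayed rows of the diagram in Remark 2 as a
snake diagram. Applying the Snake Lemma (Lemma 4.6), we obtain a connecting
homomorphism $\omega_n$ and an exact sequence

\[ \ker\alpha_n \xrightarrow{\ \varphi_{n+1}\ } \ker\beta_n \xrightarrow{\ \psi_{n+1}\ } \ker\gamma_n \xrightarrow{\ \omega_n\ } \operatorname{coker}\alpha_n \xrightarrow{\ \bar\varphi_n\ } \operatorname{coker}\beta_n \xrightarrow{\ \bar\psi_n\ } \operatorname{coker}\gamma_n. \]

Using Proposition 4.2 for each of the chain complexes $A = \{(A_n, \alpha_n)\}$, $B =
\{(B_n, \beta_n)\}$, and $C = \{(C_n, \gamma_n)\}$, we see that we obtain a diagram

\[
\begin{array}{ccccccccc}
 & & 0 & & 0 & & 0 & & \\
 & & \big\downarrow & & \big\downarrow & & \big\downarrow & & \\
 & & H_n(A) & & H_n(B) & & H_n(C) & & \\
 & & \big\downarrow & & \big\downarrow & & \big\downarrow & & \\
 & & \operatorname{coker}\alpha_n & \xrightarrow{\ \bar\varphi_n\ } & \operatorname{coker}\beta_n & \xrightarrow{\ \bar\psi_n\ } & \operatorname{coker}\gamma_n & \longrightarrow & 0 \\
 & & \big\downarrow{\scriptstyle\bar\alpha_{n-1}} & & \big\downarrow{\scriptstyle\bar\beta_{n-1}} & & \big\downarrow{\scriptstyle\bar\gamma_{n-1}} & & \\
0 & \longrightarrow & \ker\alpha_{n-2} & \xrightarrow{\ \varphi_{n-1}\ } & \ker\beta_{n-2} & \xrightarrow{\ \psi_{n-1}\ } & \ker\gamma_{n-2} & & \\
 & & \big\downarrow & & \big\downarrow & & \big\downarrow & & \\
 & & H_{n-1}(A) & & H_{n-1}(B) & & H_{n-1}(C) & & \\
 & & \big\downarrow & & \big\downarrow & & \big\downarrow & & \\
 & & 0 & & 0 & & 0 & &
\end{array}
\]

in which the rows and columns are exact and the squares commute. The third
and fourth rows form a snake diagram, and the second and fifth rows identify the
kernels and cokernels. Thus the Snake Lemma gives us an exact sequence

\[ H_n(A) \xrightarrow{\ \bar\varphi_n\ } H_n(B) \xrightarrow{\ \bar\psi_n\ } H_n(C) \xrightarrow{\ \Omega\ } H_{n-1}(A) \xrightarrow{\ \bar\varphi_{n-1}\ } H_{n-1}(B) \xrightarrow{\ \bar\psi_{n-1}\ } H_{n-1}(C) \]

for a suitable connecting homomorphism $\Omega$. Repeating this argument for all $n$
proves exactness at all modules of the long exact sequence.


To complete the proof, we have only to identify $\Omega$. Reference to the statement
of the Snake Lemma shows that the formula for $\Omega$ is

\[ \Omega(\bar c) = (\varphi_{n-1})^{-1}\big(\bar\beta_{n-1}(\bar\psi_n^{-1}(\bar c))\big) + \operatorname{image}\bar\alpha_{n-1} \]

for $\bar c \in H_n(C)$. Meanwhile, the connecting homomorphism from the first
application of the Snake Lemma is $\omega_{n-1}(c) = (\varphi_{n-1})^{-1}\big(\beta_{n-1}(\psi_n^{-1}(c))\big) + \operatorname{image}\alpha_{n-1}$
for $c \in \ker\gamma_{n-1}$. Thus $\Omega(c + \operatorname{image}\gamma_n) = \omega_{n-1}(c) + \operatorname{image}\alpha_{n-1}$ as asserted.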
Corollary 4.8. If

\[ 0 \longrightarrow A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \longrightarrow 0 \]

is an exact sequence of chain complexes in a good category and if $A$ is exact,
then $H_n(B) \cong H_n(C)$ for all $n$; if instead $C$ is exact, then $H_n(A) \cong H_n(B)$ for
all $n$. Consequently if any two of the three chain complexes are exact, then the
third one is exact.


PROOF. Theorem 4.7 gives the long exact sequence

\[ \cdots \longrightarrow H_{n+1}(C) \longrightarrow H_n(A) \longrightarrow H_n(B) \longrightarrow H_n(C) \longrightarrow H_{n-1}(A) \longrightarrow \cdots. \]

If $H_n(A) = 0$ and $H_{n-1}(A) = 0$, then we see that $H_n(B) \cong H_n(C)$. If $H_{n+1}(C) =
0$ and $H_n(C) = 0$, then we see that $H_n(A) \cong H_n(B)$.

If two of the three chain complexes are exact, then one of the two is A or C, 
and the result in the previous paragraph applies. Then the other two complexes 
(B and C, or A and B) have isomorphic homology. The hypothesis says that one 
of these two sequences of homology groups is 0. Therefore the other one is 0. 


To conclude the discussion, we shall prove results saying that the exact se- 
quences produced by Lemma 4.6 and Theorem 4.7 are functorial. 


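Lemma 4.9. In a good category $\mathcal{C}$, the six-term exact sequence that is obtained
from a snake diagram as in Figure 4.1 is functorial in the following sense: If there
are two horizontal planar snake diagrams, one with tildes (~) over all modules
and maps and the other as is, and if there are vertical maps $f_A$, etc., in three
dimensions from the tilde version of the snake diagram to the original version
such that all vertical squares commute, then the squares of the diagram

\[
\begin{array}{ccccccccccc}
\ker\tilde\alpha & \longrightarrow & \ker\tilde\beta & \longrightarrow & \ker\tilde\gamma & \xrightarrow{\ \tilde\omega\ } & \operatorname{coker}\tilde\alpha & \longrightarrow & \operatorname{coker}\tilde\beta & \longrightarrow & \operatorname{coker}\tilde\gamma \\
\big\downarrow{\scriptstyle f_A} & & \big\downarrow{\scriptstyle f_B} & & \big\downarrow{\scriptstyle f_C} & & \big\downarrow{\scriptstyle \bar f_{A'}} & & \big\downarrow{\scriptstyle \bar f_{B'}} & & \big\downarrow{\scriptstyle \bar f_{C'}} \\
\ker\alpha & \longrightarrow & \ker\beta & \longrightarrow & \ker\gamma & \xrightarrow{\ \omega\ } & \operatorname{coker}\alpha & \longrightarrow & \operatorname{coker}\beta & \longrightarrow & \operatorname{coker}\gamma
\end{array}
\]

all commute.


PROOF. For the first square from the left, the assumed commutativity shows
that $f_{A'}\tilde\alpha = \alpha f_A$, and thus $x \in \ker\tilde\alpha$ implies $f_A(x) \in \ker\alpha$; similarly $x \in \ker\tilde\beta$
implies $f_B(x) \in \ker\beta$. Thus the maps of the square are well defined. We are
given also that $\varphi f_A = f_B\tilde\varphi$, and this proves that the square commutes. The second
square from the left is handled similarly.

For the fourth square from the left, the equation $f_{A'}\tilde\alpha = \alpha f_A$ shows that
$y = \tilde\alpha(x)$ implies $f_{A'}(y) = \alpha(f_A(x))$, and thus $y \in \operatorname{image}\tilde\alpha$ implies $f_{A'}(y) \in
\operatorname{image}\alpha$; this means that $f_{A'}$ descends to a map $\bar f_{A'}$ of $\operatorname{coker}\tilde\alpha$ to $\operatorname{coker}\alpha$. Similarly
$f_{B'}$ descends to a map $\bar f_{B'}$ of $\operatorname{coker}\tilde\beta$ to $\operatorname{coker}\beta$. Thus the maps of the square
are well defined. We are given also that $\varphi' f_{A'} = f_{B'}\tilde\varphi'$, and this proves that the
square commutes. The fifth square from the left is handled similarly.

We are left with the third square from the left. The map at the left side of this
square was shown to be well defined in the first paragraph of the proof, and the
map at the right side of this square was shown to be meaningful in the second
paragraph of the proof. We are to prove that the square commutes. Referring to
the construction of $\tilde\omega$, let $\tilde c$ be in $\ker\tilde\gamma$, choose $\tilde b$ in $\tilde B$ with $\tilde\psi(\tilde b) = \tilde c$, and write
$\tilde\beta(\tilde b) = \tilde\varphi'(\tilde a')$. Then $\tilde\omega(\tilde c)$ is defined to be the coset of $\tilde a'$. Using the assumed
commutativity, we compute that $\psi f_B(\tilde b) = f_C\tilde\psi(\tilde b) = f_C(\tilde c)$ and that

\[ \varphi' f_{A'}(\tilde a') = f_{B'}\tilde\varphi'(\tilde a') = f_{B'}\tilde\beta(\tilde b) = \beta f_B(\tilde b). \]

Thus $f_B(\tilde b)$ is an element whose image under $\psi$ is $f_C(\tilde c)$, and $\beta$ of this element
is $\varphi' f_{A'}(\tilde a')$. Consequently $\omega(f_C(\tilde c))$ is the coset of $f_{A'}(\tilde a')$, which is
$\bar f_{A'}\tilde\omega(\tilde c)$. This proves the desired commutativity.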


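Theorem 4.10. In a good category $\mathcal{C}$, the long exact sequence that is obtained
from a short exact sequence of chain complexes as in Theorem 4.7 is functorial
in the following sense: if there are two short exact sequences of chain complexes
as in the theorem, one with tildes (~) over all modules and maps and the other
as is, each viewed as lying in a horizontal plane, and if there are vertical maps
$f_{A,n}$, etc., from the tilde version of the exact sequence of chain complexes to the
original version such that all vertical squares commute, then the squares of the
diagram

\[
\begin{array}{ccccccccccc}
\cdots \longrightarrow & H_{n+1}(\tilde C) & \xrightarrow{\ \bar{\tilde\omega}_n\ } & H_n(\tilde A) & \xrightarrow{\ \bar{\tilde\varphi}_n\ } & H_n(\tilde B) & \xrightarrow{\ \bar{\tilde\psi}_n\ } & H_n(\tilde C) & \xrightarrow{\ \bar{\tilde\omega}_{n-1}\ } & H_{n-1}(\tilde A) & \longrightarrow \cdots \\
 & \big\downarrow{\scriptstyle \bar f_{C,n+1}} & & \big\downarrow{\scriptstyle \bar f_{A,n}} & & \big\downarrow{\scriptstyle \bar f_{B,n}} & & \big\downarrow{\scriptstyle \bar f_{C,n}} & & \big\downarrow{\scriptstyle \bar f_{A,n-1}} & \\
\cdots \longrightarrow & H_{n+1}(C) & \xrightarrow{\ \bar\omega_n\ } & H_n(A) & \xrightarrow{\ \bar\varphi_n\ } & H_n(B) & \xrightarrow{\ \bar\psi_n\ } & H_n(C) & \xrightarrow{\ \bar\omega_{n-1}\ } & H_{n-1}(A) & \longrightarrow \cdots
\end{array}
\]

all commute.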


PROOF. Theorem 4.7 was proved by three applications of Proposition 4.2, 
which includes its own assertion of functoriality, and two applications of Lemma 
4.6, whose functoriality is addressed in Lemma 4.9. The argument involved only 
manipulations with diagrams, and functoriality is in place for every step. Hence 
functoriality is in place for the end result, and passage to the long exact sequence 
is functorial. 
In Section III.5 we exploited the fact that certain complexes were exact and
involved free modules in order to obtain chain maps and homotopies. The 
hypothesis “free” entered the arguments through Propositions 3.25 and 3.27; 
in both cases an R homomorphism was to be constructed from a free R module 
to some other R module, and a computation revealed how the R homomorphism 
should be defined on free generators. The universal mapping property of free 
modules allowed the R homomorphism to be extended from the generators to the 
whole free module. Examination of those arguments shows that it is enough to 
assume that the domain on which this R homomorphism is to be constructed is a 
“projective” R module, in the sense to be defined below, and we begin with that 
notion. 

Let $\mathcal{C}$ be a good category of unital left $R$ modules. We say that a module $P$ in
this category is projective in $\mathcal{C}$ or is a projective in $\mathcal{C}$ if whenever a diagram in
the category is given as in Figure 4.3 with $\psi$ mapping onto $B$, then there exists
$\sigma : P \to C$ in $\mathcal{C}$ such that the diagram commutes.

\[
\begin{array}{ccccc}
 & & P & & \\
 & & \big\downarrow{\scriptstyle\tau} & \searrow{\scriptstyle\sigma} & \\
0 & \longleftarrow & B & \xleftarrow{\ \psi\ } & C
\end{array}
\]


FIGURE 4.3. Defining property of a projective.


If $P$ is a free $R$ module in $\mathcal{C}$, then $P$ is projective in $\mathcal{C}$. In fact, for each free
generator $x$ of $P$, we choose an element $c_x$ in $C$ with $\psi(c_x) = \tau(x)$. Then we
define $\sigma(x) = c_x$ and extend $\sigma$ to a homomorphism. We give further examples
of projectives shortly. First let us establish in Lemma 4.11 an ostensibly stronger
property that projectives automatically satisfy.


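Lemma 4.11. If $P$ is projective in the good category $\mathcal{C}$ and if the diagram

\[
\begin{array}{ccccc}
 & & P & & \\
 & & \big\downarrow{\scriptstyle\tau} & \searrow{\scriptstyle\sigma} & \\
A' & \xleftarrow{\ \varphi\ } & A & \xleftarrow{\ \psi\ } & A''
\end{array}
\]

in $\mathcal{C}$ has $\ker\varphi = \operatorname{image}\psi$ and $\varphi\tau = 0$, then there exists a map $\sigma : P \to A''$ in $\mathcal{C}$
such that the diagram commutes.


PROOF. The hypotheses force $\operatorname{image}\tau \subseteq \ker\varphi = \operatorname{image}\psi$. Thus if we put
$B = \operatorname{image}\psi$ and $C = A''$, then the above diagram leads to the diagram in Figure
4.3. The hypothesis “projective” therefore gives us the map $\sigma$ in Figure 4.3 with
$\tau = \psi\sigma$, and the same $\sigma$ is the required map here.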


EXAMPLES OF PROJECTIVES. 


(1) If $R$ is a field $F$ and if $\mathcal{C}$ is the category of all vector spaces over $F$, then
every module is free, hence projective, since every vector space has a basis.


(2) For general $R$, if $\mathcal{C} = \mathcal{C}_R$ is the category of all unital left $R$ modules,
then the projectives are the direct summands of free modules. This fact is easily
verified from Figure 4.3 as follows: In one direction if $F = P \oplus P'$ is a free
$R$ module and the diagram in Figure 4.3 is given, extend $\tau$ to $F$ as 0 on $P'$,
find $\sigma$ from the fact that the free module $F$ is projective, and restrict $\sigma$ to $P$.
In the other direction if $P$ is projective, find a free $R$ module $F$ mapping onto
$P$ by a map $\psi$, and put $B = P$, $C = F$, and $\tau = 1$ in Figure 4.3. Then the
equality $1_P = \tau = \psi\sigma$ forces $\sigma$ to be one-one, and it follows that $P \cong \operatorname{image}\sigma$.
Consequently $F = \operatorname{image}\sigma \oplus \ker\psi$.


(3) For $R = \mathbb{Z}$, the category $\mathcal{C} = \mathcal{C}_{\mathbb{Z}}$ of all unital $R$ modules is the category
of all abelian groups. Then the projective modules are the free abelian groups by
(2), since any subgroup of a free abelian group is free abelian.


(4) For $R$ equal to any (commutative) principal ideal domain, the projective
modules in the category $\mathcal{C}_R$ of all unital $R$ modules are the free modules, by
the same argument as in (3) in combination with the Fundamental Theorem of
Finitely Generated Modules (Theorem 8.25 of Basic Algebra).


(5) For R = Z, two good categories that were listed in Section 2 were the 
category of all finitely generated abelian groups and the category of all torsion 
abelian groups. With the first of these, the projectives are the free abelian groups 
of finite rank, by the same argument as in (3). With the second of these, Problem 1 
at the end of the chapter asks for a verification that some module in the category 
fails to be the image of any projective in the category. 
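
Two small computations may help fix the defining property in Figure 4.3. Both are illustrative sketches under assumed choices of ring and modules ($\mathbb{Z}$ and $\mathbb{Z}/6\mathbb{Z}$) that do not appear in the examples above.

```latex
% (a) Assumed setting: R = Z, P = B = Z/2Z, C = Z,
%     \psi = reduction mod 2, \tau = identity of Z/2Z.
% A lift \sigma : Z/2Z -> Z with \psi\sigma = \tau cannot exist, because
% Z has no nonzero element of finite order:
\[
\operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/2\mathbb{Z}, \mathbb{Z}) = 0
\quad\Longrightarrow\quad \psi\sigma = 0 \neq 1_{\mathbb{Z}/2\mathbb{Z}} = \tau .
\]
% Hence Z/2Z is not projective in the category of all abelian groups,
% consistently with Example (3).
%
% (b) Assumed setting: R = Z/6Z. By the Chinese Remainder Theorem,
\[
\mathbb{Z}/6\mathbb{Z} \;\cong\; \mathbb{Z}/2\mathbb{Z} \oplus \mathbb{Z}/3\mathbb{Z}
\quad\text{as } \mathbb{Z}/6\mathbb{Z} \text{ modules},
\]
% so Z/3Z is a direct summand of the free module R and is projective by
% Example (2). It is not free: a nonzero free Z/6Z module has at least
% 6 elements, while Z/3Z has 3.
```

Part (b) shows that over a ring that is not a principal ideal domain, "projective" can be strictly weaker than "free".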


We come to the main result concerning flexibility in setting up chain complexes.
This result generalizes Proposition 3.25 through Corollary 3.30 in Section III.5.


Theorem 4.12. Let $X = \{(X_n, \partial_n)\}_{n=-\infty}^{\infty}$ and $X' = \{(X'_n, \partial'_n)\}_{n=-\infty}^{\infty}$ be chain
complexes in the good category $\mathcal{C}$, and let $r$ be an integer. Let $\{f_n : X_n \to X'_n\}_{n \le r}$
be a family of maps in $\mathcal{C}$ such that $\partial'_{n-1} f_n = f_{n-1}\partial_{n-1}$ for $n \le r$. If $X_n$ is projective
for $n > r$ and $X'$ is exact at each $X'_n$ with $n \ge r$, then $\{f_n : X_n \to X'_n\}_{n \le r}$ extends
to a chain map $f : X \to X'$, and $f$ is unique up to homotopy. More precisely
any two extensions are homotopic by a homotopy $h$ such that $h_n = 0$ for $n \le r$.


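REMARKS. The diagrams in question are

\[
\begin{array}{ccccccc}
X_{n+1} & \xrightarrow{\ \partial_n\ } & X_n & \xrightarrow{\ \partial_{n-1}\ } & X_{n-1} & \xrightarrow{\ \partial_{n-2}\ } & X_{n-2} \\
\big\downarrow{\scriptstyle f_{n+1}} & & \big\downarrow{\scriptstyle f_n} & & \big\downarrow{\scriptstyle f_{n-1}} & & \big\downarrow{\scriptstyle f_{n-2}} \\
X'_{n+1} & \xrightarrow{\ \partial'_n\ } & X'_n & \xrightarrow{\ \partial'_{n-1}\ } & X'_{n-1} & \xrightarrow{\ \partial'_{n-2}\ } & X'_{n-2}
\end{array}
\]

for the construction of the chain map and

\[
\begin{array}{ccccccc}
X_{n+2} & \xrightarrow{\ \partial_{n+1}\ } & X_{n+1} & \xrightarrow{\ \partial_n\ } & X_n & \xrightarrow{\ \partial_{n-1}\ } & X_{n-1} \\
 & \swarrow{\scriptstyle h_{n+1}} & \big\downarrow{\scriptstyle f_{n+1}-g_{n+1}} & \swarrow{\scriptstyle h_n} & \big\downarrow{\scriptstyle f_n-g_n} & \swarrow{\scriptstyle h_{n-1}} & \\
X'_{n+2} & \xrightarrow{\ \partial'_{n+1}\ } & X'_{n+1} & \xrightarrow{\ \partial'_n\ } & X'_n & \xrightarrow{\ \partial'_{n-1}\ } & X'_{n-1}
\end{array}
\]

for the construction of the homotopy.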


PROOF. For the existence of the chain map, it is enough by induction to
construct $f_{r+1}$. Matters are therefore as in the first of the above diagrams with
$n = r$. Since $X'$ is exact at $X'_r$ and $X_{r+1}$ is projective, we are in the situation
of Lemma 4.11 with $P = X_{r+1}$, $A'' = X'_{r+1}$, $A = X'_r$, $A' = X'_{r-1}$, $\psi = \partial'_r$,
$\varphi = \partial'_{r-1}$, and $\tau = f_r\partial_r$. The lemma gives a map $\sigma : P \to A''$ with $\psi\sigma = \tau$.
If we take $f_{r+1} = \sigma$, then $\psi\sigma = \tau$ says that $\partial'_r f_{r+1} = f_r\partial_r$, and the inductive
construction of the chain map is complete.

For the uniqueness up to homotopy, let $f : X \to X'$ and $g : X \to X'$
be two chain maps such that $f_n = g_n$ for $n \le r$. Define $h_n : X_n \to X'_{n+1}$
to be 0 for $n \le r$, and observe that the system of functions $\{h_n\}_{n \le r}$ satisfies
$h_{n-1}\partial_{n-1} + \partial'_n h_n = f_n - g_n$ for $n \le r$ because $f_n = g_n$ for $n \le r$. Proceeding
inductively, suppose that $s \ge r$ and that $h_n$ has been constructed for $n \le s$ such that
$h_{n-1}\partial_{n-1} + \partial'_n h_n = f_n - g_n$ for $n \le s$. We are to construct $h_{s+1} : X_{s+1} \to X'_{s+2}$.
This is the situation of the second diagram above with $n = s$. Since $s \ge r$, $X'$
is exact at $X'_{s+1}$ and $X_{s+1}$ is projective. Thus we are in the situation of Lemma
4.11 with $P = X_{s+1}$, $A'' = X'_{s+2}$, $A = X'_{s+1}$, $A' = X'_s$, $\psi = \partial'_{s+1}$, $\varphi = \partial'_s$, and
$\tau = (f_{s+1} - g_{s+1}) - h_s\partial_s$. The lemma gives a map $\sigma : P \to A''$ with $\psi\sigma = \tau$.
If we take $h_{s+1} = \sigma$, then $\psi\sigma = \tau$ says that $\partial'_{s+1}h_{s+1} = (f_{s+1} - g_{s+1}) - h_s\partial_s$,
and the inductive construction of the homotopy is complete.


A resolution in the category $\mathcal{C}$ is an exact chain complex $X = \{(X_n, \partial_n)\}_{n=-\infty}^{\infty}$
or cochain complex $X = \{(X_n, d_n)\}_{n=-\infty}^{\infty}$ such that $X_n = 0$ for $n \le -2$. We say
that the complex is a resolution of $X_{-1}$, and we abbreviate it as

\[ X^+ \longrightarrow X_{-1} \qquad\text{or}\qquad X^+ \longleftarrow X_{-1} \]

with $X^+$ referring to

\[ \cdots \longrightarrow X_2 \longrightarrow X_1 \longrightarrow X_0 \qquad\text{or}\qquad \cdots \longleftarrow X_2 \longleftarrow X_1 \longleftarrow X_0 \]

in the respective cases. A chain complex $X = (X^+ \to M)$ that forms a
resolution is called a free resolution of $M$ if every $X_n$ for $n \ge 0$ is a free module.
It is called a projective resolution of $M$ if every $X_n$ for $n \ge 0$ is projective.
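
As a minimal sketch of the definition, here is a free resolution of a cyclic group. The module $M = \mathbb{Z}/m\mathbb{Z}$ with $m \ge 2$ is an assumed example, and the indexing follows the convention $\partial_{n-1} : X_n \to X_{n-1}$ used in Theorem 4.12.

```latex
% Assumed example: R = Z, M = Z/mZ with m >= 2.
% Put X_{-1} = Z/mZ, X_0 = Z, X_1 = Z, and X_n = 0 for all other n.
\[
\cdots \longrightarrow 0 \longrightarrow
\underbrace{\mathbb{Z}}_{X_1} \xrightarrow{\ \partial_0 \,=\, m\ }
\underbrace{\mathbb{Z}}_{X_0} \xrightarrow{\ \partial_{-1}\ }
\underbrace{\mathbb{Z}/m\mathbb{Z}}_{X_{-1}} \longrightarrow 0 .
\]
% Exactness:
%   at X_1:    multiplication by m is one-one on Z since m is nonzero;
%   at X_0:    ker(\partial_{-1}) = mZ = image(\partial_0);
%   at X_{-1}: reduction modulo m is onto.
% Since X_0 and X_1 are free, this is a free (hence projective)
% resolution of Z/mZ.
```

Because subgroups of free abelian groups are free, every abelian group has a free resolution of this two-step shape; over other rings a resolution may need infinitely many nonzero terms.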


Corollary 4.13. Let $M$ be a module in a good category $\mathcal{C}$ and let
$X = (X^+ \to M)$ and $X' = (X'^+ \to M)$
be two projective resolutions of $M$. Then there exist chain maps $f : X \to X'$
and $g : X' \to X$ with $f_{-1} = 1_M$ and $g_{-1} = 1_M$, and any two such chain maps $f$
and $g$ have the property that $gf : X \to X$ is homotopic to $1_X$ and $fg : X' \to X'$
is homotopic to $1_{X'}$.


PROOF. The existence of $f$ extending $f_{-1} = 1_M$ is immediate by applying
the first part of Theorem 4.12 with $r = -1$. The hypotheses apply because $X_n$
is projective for $n > -1$ and $X'$ is exact at $X'_n$ for $n \ge -1$. A similar argument
shows the existence of $g$.

If we have $f$ and $g$, then $gf : X \to X$ and $1_X : X \to X$ are chain maps
that extend the partial chain map given for $n \le -1$ by $1_M$ for $n = -1$ and by 0
for $n \le -2$. Since again $X_n$ is projective for $n > -1$ and the target complex $X$ is exact at $X_n$ for
$n \ge -1$, the second part of the theorem shows that $gf$ and $1_X$ are homotopic. A
similar argument shows that $fg$ and $1_{X'}$ are homotopic.


There is an analogous sequence of results that ends with resolutions that are 
cochain maps. They will be equally as useful as the above results when we 
introduce derived functors in the next section. For the results below, the notion 
of a projective is replaced by that of an injective. We say that a module $I$ in the
good category $\mathcal{C}$ is injective in $\mathcal{C}$ or is an injective in $\mathcal{C}$ if whenever a diagram in
the category is given as in Figure 4.4 with $\varphi$ mapping one-one from $B$ into $C$ and with $\tau : B \to I$,
then there exists $\sigma : C \to I$ in $\mathcal{C}$ such that the diagram commutes.


FIGURE 4.4. Defining property of an injective. 
We can think of the condition as saying that we can always extend such a $\tau$ from
$B$ to $C$, the extension being $\sigma$. In any event, we give some examples after proving
an analog of Lemma 4.11.


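Lemma 4.14. If $I$ is injective in the good category $\mathcal{C}$ and if the diagram

\[
\begin{array}{ccccc}
 & & I & & \\
 & & \big\uparrow{\scriptstyle\tau} & \nwarrow{\scriptstyle\sigma} & \\
A' & \xrightarrow{\ \psi\ } & A & \xrightarrow{\ \varphi\ } & A''
\end{array}
\]

in $\mathcal{C}$ has $\ker\varphi = \operatorname{image}\psi$ and $\tau\psi = 0$, then there exists a map $\sigma : A'' \to I$ in $\mathcal{C}$
such that the diagram commutes.


PROOF. The hypotheses force $\ker\tau \supseteq \operatorname{image}\psi = \ker\varphi$. Thus $\tau : A \to I$ and
$\varphi : A \to A''$ descend to maps $\bar\tau : A/\ker\varphi \to I$ and $\bar\varphi : A/\ker\varphi \to A''$. If we
put $B = A/\ker\varphi$ and $C = A''$, then the above diagram leads to Figure 4.4 with
$\bar\tau$ and $\bar\varphi$ in place of $\tau$ and $\varphi$. The hypothesis “injective” gives us $\sigma$ in Figure 4.4
with $\bar\tau = \sigma\bar\varphi$, and the same $\sigma$ is the required map in the diagram above.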


EXAMPLES OF INJECTIVES. 


(1) If R is a field F and if C is the category of all vector spaces over F, then
every module is injective. In fact, in Figure 4.4 we write C = image φ ⊕ B', and
we let η : image φ → B be the inverse of φ : B → image φ. Then we can define
σ to be 0 on B' and to be τη on image φ.


(2) Let C be the category of all abelian groups (unital Z modules). An abelian 
group G is said to be divisible if for each integer n ≠ 0 and each x ∈ G, there
exists y ∈ G with ny = x. Two examples of divisible abelian groups are the
additive group of rationals and the additive group of rationals modulo 1. It is 
easy to see that any quotient of a divisible group is divisible and that direct sums 
of divisible groups are divisible. Let us see for abelian groups that injective is 
equivalent to divisible. 

The argument that injective implies divisible is easy: Let I be injective. Given
x ∈ I and n ≠ 0, let B = C = Z, let τ : Z → I have τ(k) = kx, and let
φ : Z → Z have φ(k) = kn. Setting up Figure 4.4, we obtain a σ : Z → I
with τ = σφ. If we put y = σ(1) and evaluate both sides at 1, then we obtain
x = τ(1) = σ(φ(1)) = σ(n) = nσ(1) = ny, as required.

The argument that divisible implies injective uses Zorn's Lemma. Let I be
divisible, and suppose that B, C, φ, and τ are given as in Figure 4.4. Consider
the set S of abelian-group homomorphisms σ' having domain a subgroup of
C containing φ(B), having range I, and having σ'φ = τ. Order S by inclusion
upward of the corresponding sets of ordered pairs. The set S is nonempty because
the homomorphism σ' with domain φ(B) and values σ'(φ(b)) = τ(b) lies in
S; σ' is well defined because φ is assumed one-one. Zorn's Lemma yields a
maximal element σ in S, say with domain C_0. We show that C_0 = C. Arguing
by contradiction, suppose that C_0 is a proper subgroup. Let c be in C but not C_0.
The set of integers k with kc in C_0 is an ideal in Z, and we let n be a generator.
Since I is divisible, there exists an element a in I with na = σ(nc). Define σ̃ on
the subgroup generated by c and C_0 by the formula σ̃(kc + c_0) = ka + σ(c_0) for
k ∈ Z and c_0 ∈ C_0. We need to check that σ̃ is well defined. If kc + c_0 = k'c + c_0',
then (k − k')c = c_0' − c_0 is in C_0, and thus k − k' = qn for some integer q.
Hence σ̃(kc + c_0) − σ̃(k'c + c_0') = (k − k')a + σ(c_0 − c_0') = qna + σ(c_0 − c_0') =
qσ(nc) + σ(c_0 − c_0') = qσ(nc) − σ((k − k')c) = qσ(nc) − qσ(nc) = 0. Therefore
σ̃ is a nontrivial additive extension of σ, in contradiction to maximality of σ, and
the proof is complete.


(3) For R = Z, two good categories that were listed in Section 2 were the 
category of all finitely generated abelian groups and the category of all torsion 
abelian groups. With the first of these, Problem 1 at the end of the chapter asks 
for a verification that some module in the category fails to be a submodule of any 
injective. With the second of these, the injectives are the torsion divisible groups. 
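
The divisibility condition of Example 2 can be tested by direct computation in small cases. The following Python sketch is only an illustration under stated assumptions: elements of the rationals modulo 1 are represented by fractions in [0, 1), and the names divide_in_q_mod_z and is_divisible_cyclic are hypothetical names chosen here, not notation from the text. The first function solves ny = x in the rationals modulo 1, and the second checks by brute force whether a finite cyclic group Z/m is divisible.

```python
from fractions import Fraction


def divide_in_q_mod_z(x, n):
    """Return y in Q/Z, represented in [0, 1), with n*y = x in Q/Z."""
    if n == 0:
        raise ValueError("n must be nonzero")
    # Any rational y with n*y = x works; reduce it modulo 1.
    return (Fraction(x) / n) % 1


def is_divisible_cyclic(m):
    """Brute-force divisibility test for the finite cyclic group Z/m."""
    # It is enough to test n = 1, ..., m because n and n + m act identically
    # on Z/m, and n and -n have the same image.
    return all(
        any((n * y - x) % m == 0 for y in range(m))
        for n in range(1, m + 1)
        for x in range(m)
    )


x = Fraction(2, 5)
y = divide_in_q_mod_z(x, 3)
print(y, (3 * y - x) % 1 == 0)
print(is_divisible_cyclic(6))
```

Since multiplication by m is the zero map on Z/m, the brute-force test fails for every m > 1. Such a group is therefore not injective in the category of all abelian groups, although it embeds in the divisible group of rationals modulo 1.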


The next proposition extends Example 2 and its proof to general R. Although 
the condition in the proposition is not very intuitive for general R, it has a simple 
interpretation for (commutative) principal ideal domains; see Problem 4 at the 
end of the chapter. 


Proposition 4.15. A unital left R module I is injective for the good category
of all unital left R modules if and only if every R homomorphism of a left ideal
J of R into I extends to an R homomorphism R → I.


PROOF. The necessity is immediate from Figure 4.4 and the definition of
“injective” if we take B = J, C = R and write τ for the given R homomorphism
of J into I.

For the sufficiency, suppose that I and a diagram as in Figure 4.4 are given.
Consider the set S of R module homomorphisms σ' having domain an R submodule
of C containing φ(B) and having range I such that σ'φ = τ, and
order S by inclusion upward of the corresponding sets of ordered pairs. The
set S is nonempty because the homomorphism σ' with domain φ(B) and values
σ'(φ(b)) = τ(b) lies in S; σ' is well defined because φ is assumed one-one.
Zorn's Lemma yields a maximal element σ in S, say with domain C_0. We
show that C_0 = C. Arguing by contradiction, suppose that C_0 is a proper R
submodule of C. Let c be in C but not C_0. The set of elements r ∈ R with rc
in C_0 is a left ideal J in R, and the mapping ψ(r) = σ(rc) is a well-defined R
homomorphism of J into I. By hypothesis, ψ extends to an R homomorphism
Ψ : R → I. Define σ̃ on the R submodule generated by c and C_0 by the formula
σ̃(rc + c_0) = Ψ(r) + σ(c_0) for r ∈ R and c_0 ∈ C_0. We need to check that σ̃ is
well defined. If rc + c_0 = r'c + c_0', then (r − r')c = c_0' − c_0 is in C_0, and thus
r − r' is in J. Consequently Ψ(r) − Ψ(r') = Ψ(r − r') = σ((r − r')c). Hence
σ̃(rc + c_0) − σ̃(r'c + c_0') = (Ψ(r) − Ψ(r')) + σ(c_0 − c_0') = σ((r − r')c) + σ(c_0 − c_0') =
σ((r − r')c) − σ((r − r')c) = 0. Therefore σ̃ is a nontrivial extension of σ, in
contradiction to maximality of σ, and the proof is complete.
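
The criterion of Proposition 4.15 becomes a finite check when R is a finite ring. The following Python sketch, given only as an illustration, takes R = Z/n and the module I = Z/m with m dividing n. It relies on two facts that are assumptions of the sketch rather than statements from the text: every ideal of Z/n is generated by a divisor d of n, and an R homomorphism of the ideal generated by d into Z/m is determined by the image t of d, subject to (n/d)t = 0 in Z/m. The function names are hypothetical.

```python
def ideals(n):
    """Generators of the ideals of R = Z/n, one divisor d of n per ideal."""
    return [d for d in range(1, n + 1) if n % d == 0]


def homs_from_ideal(n, d, m):
    """R homomorphisms of the ideal dR into Z/m, listed by the image t of d.

    The rule k*d -> k*t is well defined exactly when (n // d) * t = 0 in Z/m,
    because n // d generates the annihilator of d in Z/n.
    """
    return [t for t in range(m) if ((n // d) * t) % m == 0]


def satisfies_extension_criterion(n, m):
    """Check the extension condition of Proposition 4.15 for I = Z/m over Z/n."""
    if n % m != 0:
        raise ValueError("Z/m is a Z/n module only when m divides n")
    for d in ideals(n):
        for t in homs_from_ideal(n, d, m):
            # A map R -> I has the form r -> r*y, and it extends the given
            # homomorphism exactly when d*y = t in Z/m.
            if not any((d * y - t) % m == 0 for y in range(m)):
                return False
    return True


print(satisfies_extension_criterion(4, 4))
print(satisfies_extension_criterion(4, 2))
```

For R = Z/4 the check succeeds for I = Z/4 and fails for I = Z/2: the homomorphism of the ideal generated by 2 that sends 2 to 1 has no extension to R, since 2y = 0 for every y in Z/2.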


Now we can prove an analog of Theorem 4.12 for cochain complexes. This 
result had no counterpart in Chapter II. 


Theorem 4.16. Let X = {(X_n, d_n)}, −∞ < n < ∞, and X' = {(X'_n, d'_n)}, −∞ < n < ∞, be
cochain complexes in the good category C, and let r be an integer. Let
{f_n : X_n → X'_n}_{n≤r} be a family of maps in C such that d'_{n−1} f_{n−1} = f_n d_{n−1}
for n ≤ r. If X is exact at each X_n with n ≥ r and X'_n is injective for n > r, then
{f_n : X_n → X'_n}_{n≤r} extends to a cochain map f : X → X', and f is unique up
to homotopy. More precisely any two extensions are homotopic by a homotopy
h such that h_n = 0 for n ≤ r.


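REMARKS. The diagrams in question are

    ··· → X_{n−1} --d_{n−1}--> X_n --d_n--> X_{n+1} → ···

    ··· → X'_{n−1} --d'_{n−1}--> X'_n --d'_n--> X'_{n+1} → ···

    (vertical maps f_{n−1} : X_{n−1} → X'_{n−1} and f_n : X_n → X'_n given;
    vertical map f_{n+1} : X_{n+1} → X'_{n+1} to be constructed)

for the construction of the cochain map and

    ··· → X_{n−1} --d_{n−1}--> X_n --d_n--> X_{n+1} --d_{n+1}--> X_{n+2} → ···

    ··· → X'_{n−1} --d'_{n−1}--> X'_n --d'_n--> X'_{n+1} --d'_{n+1}--> X'_{n+2} → ···

    (vertical maps f_k − g_k : X_k → X'_k; diagonal maps h_k : X_k → X'_{k−1},
    with h_{n+2} : X_{n+2} → X'_{n+1} to be constructed)

for the construction of the homotopy.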


PROOF. For the existence of the cochain map, it is enough by induction to
construct f_{r+1}. Matters are therefore as in the first of the above diagrams with
n = r. Since X is exact at X_r and X'_{r+1} is injective, we are in the situation of
Lemma 4.14 with I = X'_{r+1}, A'' = X_{r+1}, A = X_r, A' = X_{r−1}, ψ = d_{r−1},
φ = d_r, and τ = d'_r f_r. The lemma gives a map σ : A'' → I with σφ = τ.
If we take f_{r+1} = σ, then σφ = τ says that f_{r+1} d_r = d'_r f_r, and the inductive
construction of the cochain map is complete.

For the uniqueness up to homotopy, let f : X → X' and g : X → X' be
two cochain maps such that f_n = g_n for n ≤ r. Define h_n : X_n → X'_{n−1} to
be 0 for n ≤ r + 1, and observe that the system of functions {h_n}_{n≤r+1} satisfies
h_{n+1} d_n + d'_{n−1} h_n = f_n − g_n for n ≤ r because f_n = g_n for n ≤ r. Proceeding
inductively, suppose that s ≥ r and that h_n has been constructed for n ≤ s + 1 such
that h_{n+1} d_n + d'_{n−1} h_n = f_n − g_n for n ≤ s. We are to construct h_{s+2} : X_{s+2} →
X'_{s+1}. This is the situation of the second diagram with n = s. Since s ≥ r, X
is exact at X_{s+1} and X'_{s+1} is injective. Thus we are in the situation of Lemma
4.14 with I = X'_{s+1}, A'' = X_{s+2}, A = X_{s+1}, A' = X_s, ψ = d_s, φ = d_{s+1}, and
τ = (f_{s+1} − g_{s+1}) − d'_s h_{s+1}. The lemma gives a map σ : A'' → I with σφ = τ.
If we take h_{s+2} = σ, then σφ = τ says that h_{s+2} d_{s+1} = (f_{s+1} − g_{s+1}) − d'_s h_{s+1},
and the inductive construction of the homotopy is complete.


A cochain complex X = (X^+ <--ε-- M) that forms a resolution is called an
injective resolution of M if every X_n for n ≥ 0 is an injective.


Corollary 4.17. Let M be a module in a good category C and let

    X = (X^+ <--ε-- M)    and    X' = (X'^+ <--ε'-- M)

be two injective resolutions of M. Then there exist cochain maps f : X → X'
and g : X' → X with f_{−1} = 1_M and g_{−1} = 1_M, and any two such cochain
maps f and g have the property that gf : X → X is homotopic to 1_X and
fg : X' → X' is homotopic to 1_{X'}.


PROOF. The existence of f extending f_{−1} = 1_M is immediate by applying
the first part of Theorem 4.16 with r = −1. The hypotheses apply because X
is exact at X_n for n ≥ −1 and X'_n is injective for n > −1. A similar argument
shows the existence of g.

If we have f and g, then gf : X → X and 1_X : X → X are cochain maps
that extend the partial cochain map given for n ≤ −1 by 1_M for n = −1 and by 0
for n ≤ −2. Since again X is exact at X_n for n ≥ −1 and X'_n is injective for
n > −1, the second part of the theorem shows that gf and 1_X are homotopic. A
similar argument shows that fg and 1_{X'} are homotopic.


We conclude with elementary characterizations of projectives and injectives 
that will turn out to be quite useful in the next two sections. We begin with a 
lemma⁶ that will be useful now and will be helpful as motivation in the next
section.


⁶The lemma is a slight variant of Problem 5 at the end of Chapter X of Basic Algebra.


Lemma 4.18. Let C be a good category of unital left R modules, and let

    0 → A --φ--> B --ψ--> C → 0

be an exact sequence in C. Then the following conditions are equivalent:

(a) B is a direct sum B = B' ⊕ ker ψ of modules in C,
(b) there exists an R homomorphism σ : C → B such that ψσ = 1_C,
(c) there exists an R homomorphism τ : B → A such that τφ = 1_A.


REMARK. When the equivalent conditions of this lemma are satisfied, one says 
that the exact sequence is split. 


PROOF. If (a) holds, then ψ|_{B'} is one-one from B' onto C. Let σ be its inverse.
Then σ : C → B' is one-one with ψσ = 1_C. So (b) holds.

If (b) holds, then any b in B has the property that b − σψ(b) has ψ(b − σψ(b)) =
ψ(b) − 1_C ψ(b) = 0 and is therefore in image φ. Write b − σψ(b) = φ(a) for
some a depending on b; a is unique because φ is one-one. If τ : B → A is defined
by τ(b) = a, then τ is an R homomorphism by the uniqueness of a. Consider
τ(φ(a)) for a in A. The element b = φ(a) has b − σψ(b) = φ(a) − σψφ(a) =
φ(a) − σ(0) = φ(a), and the definition of τ therefore says that τ(φ(a)) = a.
Hence τφ = 1_A, and (c) holds.

If (c) holds, then B' = ker τ is an R submodule of B. If b is in B' ∩ image φ,
then b = φ(a) for some a ∈ A and also 0 = τ(b) = τφ(a) = 1_A(a) = a. So
b = 0, and B' ∩ image φ = 0. If b ∈ B is given, write b = (b − φτ(b)) + φτ(b).
Then φτ(b) is certainly in image φ, and τ(b − φτ(b)) = τ(b) − 1_A τ(b) = 0 shows
that b − φτ(b) is in B'. Therefore B = B' ⊕ image φ. Since image φ = ker ψ,
we see that B = B' ⊕ ker ψ and that (a) holds.
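
For a concrete feel for conditions (b) and (c) of Lemma 4.18, one can search by brute force for the maps σ and τ in a family of small examples. The Python sketch below is an illustration only. It assumes the exact sequences 0 → Z/a → Z/(ab) → Z/b → 0 of abelian groups in which φ is multiplication by b and ψ is reduction modulo b; this family and the function names are choices made here, not part of the text. A homomorphism out of a cyclic group is recorded by the image of the generator 1.

```python
from math import gcd


def find_section(a, b):
    """Search for sigma : Z/b -> Z/(a*b) with psi(sigma(y)) = y.

    Returns s = sigma(1), or None if no such homomorphism exists.
    """
    n = a * b
    for s in range(n):
        well_defined = (b * s) % n == 0   # b*1 = 0 in Z/b must map to 0
        is_section = s % b == 1 % b       # psi(sigma(1)) = 1 in Z/b
        if well_defined and is_section:
            return s
    return None


def find_retraction(a, b):
    """Search for tau : Z/(a*b) -> Z/a with tau(phi(x)) = x.

    Returns t = tau(1), or None if no such homomorphism exists.
    """
    for t in range(a):
        # phi(1) = b, so tau(phi(1)) = b*t must equal 1 in Z/a.
        if (b * t) % a == 1 % a:
            return t
    return None


for a, b in [(2, 3), (2, 2)]:
    print(a, b, find_section(a, b), find_retraction(a, b), gcd(a, b))
```

In this family a section exists exactly when a retraction exists, as the lemma predicts, and both exist exactly when a and b are relatively prime. In particular the sequence 0 → Z/2 → Z/4 → Z/2 → 0 is not split.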


Proposition 4.19. If C is a good category of unital left R modules, then
(a) a module P in C is projective if and only if Hom_R(P, · ) is an exact functor
from C into C_Z, if and only if every exact sequence

    0 → A --φ--> B --ψ--> C → 0

in C splits when its third nonzero member C equals P, and
(b) a module I in C is injective if and only if Hom_R( · , I) is an exact functor
from C into C_Z, if and only if every exact sequence

    0 → A --φ--> B --ψ--> C → 0

in C splits when its first nonzero member A equals I.


PROOF. For (a), suppose that P is given. The functor Hom_R(P, · ) is covariant
and left exact, no matter what P is. Proposition 4.3 shows it is exact if and only if
it carries short exact sequences into short exact sequences, and the left exactness
means that the functor is exact if and only if it carries onto maps from B to C
to onto maps from Hom_R(P, B) to Hom_R(P, C). If ψ : B → C is given, then
Hom(1, ψ) : Hom_R(P, B) → Hom_R(P, C) operates on a map σ in Hom_R(P, B)
by Hom(1, ψ)(σ) = ψσ. The statement that the equation ψσ = τ is solvable
for σ for each τ in Hom_R(P, C) whenever ψ is onto is precisely the statement
that Figure 4.3 is solvable for σ for all possible τ's whenever B → C → 0 is
exact, and thus P is projective if and only if the functor is exact.

If P is projective and an exact sequence with C = P is given, take τ = 1_P
in Figure 4.3. The projective property yields a map σ : P → B with ψσ = 1_P,
and Lemma 4.18b shows that the exact sequence splits.

Conversely suppose that every short exact sequence with P as its third nonzero
member splits. Suppose that a diagram as in Figure 4.3 is given with ψ : C → B
onto and with τ mapping P into B. Let S = C ⊕ P, and let T be the R
submodule {(c, x) ∈ C ⊕ P | ψ(c) = τ(x)} of S. Denote the projections
of S to C and P by p_C and p_P, and let j : T → S be the inclusion. The
map⁷ p_P j carries T onto P; in fact, if x ∈ P is given, then ψ : C → B
onto implies that there exists c_x ∈ C with ψ(c_x) = τ(x). Then (c_x, x) lies in
T, and p_P j(c_x, x) = p_P(c_x, x) = x. Consequently we have a 5-term exact
sequence with terms 0, ker(p_P j), T, P, 0, and this must split by hypothesis.
Thus there exists a map q : P → T with p_P j q = 1_P. Define σ = p_C j q.
For x ∈ P, jq(x) is some member of S of the form (c, x) with ψ(c) = τ(x).
Hence ψσ(x) = ψ p_C j q(x) = ψ p_C(c, x) = ψ(c) = τ(x). Thus ψσ = τ, and
σ : P → C is the required map that exhibits P as projective.

For (b), suppose that I is given. The functor Hom_R( · , I) is contravariant and
left exact, no matter what I is. It is exact if and only if it carries one-one maps
from A to B to onto maps from Hom_R(B, I) to Hom_R(A, I). If φ : A → B is
given, then Hom(φ, 1) : Hom_R(B, I) → Hom_R(A, I) operates on a map σ in
Hom_R(B, I) by Hom(φ, 1)(σ) = σφ. The statement that the equation σφ = τ
is solvable for σ for each τ in Hom_R(A, I) whenever φ is one-one is precisely
the statement that Figure 4.4 is solvable for σ for all possible τ's whenever
0 → A → B is exact, and thus I is injective if and only if the functor is exact.

If I is injective and an exact sequence with A = I is given, take τ = 1_I in
Figure 4.4. The injective property yields a map σ : B → I with σφ = 1_I, and
Lemma 4.18c shows that the exact sequence splits.

Conversely suppose that every short exact sequence with I as its first nonzero
member splits. Suppose that a diagram as in Figure 4.4 is given with φ : A → B
one-one and with τ mapping A into I. Let S = B ⊕ I, and let T be the quotient of
S by the R submodule {(φ(a), −τ(a)) | a ∈ A}. Denote the inclusions of B and I
into S by i_B and i_I, and let k : S → T be the quotient mapping. The composition⁸
k i_I is one-one from I into T. In fact, if k i_I(x) = 0 for some x ∈ I, then (0, x)
is a member of S of the form (φ(a), −τ(a)) for some a ∈ A; thus φ(a) = 0, and
the fact that φ is one-one implies that a = 0 and hence that x = −τ(a) = 0.
Consequently we have a 5-term exact sequence with terms 0, I, T, T/I, 0, and
this must split by hypothesis. Thus there exists a map r : T → I with r k i_I = 1_I.
Define σ = r k i_B. For a ∈ A, i_B φ(a) − i_I τ(a) = (φ(a), −τ(a)) is in ker k. Thus
k i_B φ(a) = k i_I τ(a), and σφ(a) = r k i_B φ(a) = r k i_I τ(a) = 1_I τ(a) = τ(a) for
a ∈ A. Therefore σφ = τ, and σ : B → I is the required map that exhibits I as
injective.


⁷The pair (p_C j, p_P j) is called the pullback of (τ, ψ). See Problem 35 at the end of the chapter.


5. Derived Functors 


Now we shall undertake the main construction of the chapter, that of “derived 
functors.” Let C be a good category of unital left R modules. Arranging for 
derived functors to be defined on every module in C requires that each module M 
in C have either a projective resolution or an injective resolution, and thus C must 
have either many projectives or many injectives in a suitable sense. Let us make 
the condition precise. 

We say that C has enough projectives if every module in C is a quotient of a 
projective in C. Suppose that this condition is satisfied. Let M be a module in C, 
and let X_0 be a projective that maps onto M, say by a map ε. Then ker ε is in C,
since good categories are closed under the passage to submodules, and we let X_1
be a projective in C that maps onto ker ε, say by a map ∂_0. Similarly let X_2 be a
projective that maps onto ker ∂_0 in X_1, say by a map ∂_1, and so on. The result is
that we obtain a projective resolution of the form X^+ --ε--> M with X^+ given by

    X^+ :  ··· → X_2 --∂_1--> X_1 --∂_0--> X_0.


Consequently the condition “enough projectives” implies that every module in C 
has a projective resolution in C. 

Similarly we say that C has enough injectives if every module in C is a 
submodule of an injective in C. Suppose that this condition is satisfied. Let 
M be a module in C, and let X_0 be an injective into which M embeds, say by
a map ε. Then X_0/image ε is in C, since good categories are closed under the
passage to quotient modules, and we let X_1 be an injective into which X_0/image ε
embeds, say by a map d̄_0. Let d_0 be the composition of the quotient map from X_0
to X_0/image ε, followed by d̄_0; then d_0 maps X_0 into X_1 with ker d_0 = image ε.
We let X_2 be an injective into which X_1/image d_0 embeds, say by d̄_1, and we let
d_1 be the composition of the quotient map from X_1 to X_1/image d_0, followed by
d̄_1; then d_1 maps X_1 into X_2 with ker d_1 = image d_0. Continuing in this way, we
obtain an injective resolution of the form X^+ <--ε-- M with X^+ given by

    X^+ :  ··· ← X_2 <--d_1-- X_1 <--d_0-- X_0.

Consequently the condition “enough injectives” implies that every module in C
has an injective resolution in C.


⁸The pair (k i_B, k i_I) is called the pushout of (τ, φ). See Problem 35 at the end of the chapter.

The category C_R of all unital left R modules certainly has enough projectives.
In fact, every module in C_R is the quotient of a free R module, and free R modules
are projective in C_R. It is less trivial but still true that C_R has enough injectives.
Let us pause for a moment to prove this result in Proposition 4.20 below. 

As is shown in Problems 1-2 at the end of the chapter, other good categories 
of unital left R modules may or may not have enough projectives or enough 
injectives, and a good category may have the one without the other. 


Proposition 4.20. If R is any ring with identity, then the category of all unital 
left R modules has enough injectives. 


PROOF. We treat first the case that R = Z. In view of Example 2 of injectives, 
we are to exhibit an arbitrary abelian group A as isomorphic to a subgroup of a 
divisible group. We know that A is isomorphic to a quotient of some free abelian 
group. Write A = F/S with F a direct sum of copies of Z and S equal to some 
subgroup of F. Taking a Z basis for F and forming a Q vector space with that 
same basis, we can regard F as a subgroup of the additive group D of a rational 
vector space. The group D is divisible, and A is isomorphic to a subgroup of 
D/S. Any quotient of a divisible group is divisible, and thus D/S is divisible. 

Now we allow R to be any ring with identity. We shall make use of various
results from Chapter X of Basic Algebra. If M is any unital left R module, let us
denote by FM the underlying abelian group⁹ of M. If we regard R as a (Z, R)
bimodule, then Proposition 10.17 makes Hom_Z(R, FM) into a left R module,
with (rφ)(r') = φ(r'r) for r and r' in R. The mapping m ↦ φ_m with φ_m(r) = rm
is a one-one R homomorphism of M into Hom_Z(R, FM). From the previous
paragraph we can find a divisible abelian group D with FM ⊆ D, and we can then
regard the left R module Hom_Z(R, FM) as an R submodule of Hom_Z(R, D).
Consequently we can regard M as an R submodule of Hom_Z(R, D). We are
going to prove that I = Hom_Z(R, D) is injective in C_R.

We digress for a moment to make a side calculation. With D fixed and N equal
to any unital left R module, we make use of the isomorphism

    Hom_R(N, Hom_Z(R, D)) ≅ Hom_Z(R ⊗_R N, D)

given in Proposition 10.23 of Basic Algebra; in the expression R ⊗_R N, the left
factor of R is to be regarded as a right R module (and not also a left R module),
and then R ⊗_R N is really F(R ⊗_R N) in the sense that the tensor product retains
only the structure of an abelian group. Meanwhile, Corollary 10.19a gives us

    Hom_Z(R ⊗_R N, D) ≅ Hom_Z(N, D);

here the R on the left is an (R, R) bimodule, and the isomorphism is one of left
R modules. However, there is no harm in applying F to both sides and obtaining

    Hom_Z(F(R ⊗_R N), D) ≅ Hom_Z(FN, D).

Thus

    Hom_R(N, Hom_Z(R, D)) ≅ Hom_Z(FN, D).        (∗)

If we track down the isomorphisms in the results of Chapter X, we see that
the map from left to right sends φ ∈ Hom_R(N, Hom_Z(R, D)) to the map
Φ ∈ Hom_Z(FN, D) with Φ(x) = φ(x)(1) for x ∈ N, and the inverse sends
Φ to φ with φ(x)(r) = Φ(rx).


⁹F is called the forgetful functor from C_R to C_Z.

Now we return to I = Hom_Z(R, D). By Proposition 4.19b, I will be injective
if and only if Hom_R( · , I) is an exact functor. Since this functor is contravariant
and left exact, it is enough to prove that if 0 → A --ψ--> B is exact in C_R, then

    Hom_R(B, I) --Hom(ψ,1)--> Hom_R(A, I) → 0        (∗∗)

is exact in C_Z. Let us reinterpret (∗∗) in the light of the isomorphism (∗) when
N = B and N = A. If φ is in Hom_R(B, Hom_Z(R, D)), then Hom(ψ, 1)(φ)
is the member φψ of Hom_R(A, Hom_Z(R, D)). The corresponding members of
Hom_Z(FB, D) and Hom_Z(FA, D) are Φ with Φ(b) = φ(b)(1) and a member
Φ' of Hom_Z(FA, D) with Φ'(a) = φψ(a)(1). Thus Φ' = Φ(Fψ), and the mapping
Hom(ψ, 1) in (∗∗) translates under the isomorphisms (∗) into the mapping
Hom(Fψ, 1) of Hom_Z(FB, D) into Hom_Z(FA, D). The group D is divisible,
hence injective in C_Z. Since Fψ : FA → FB is one-one and D is injective
in C_Z, Proposition 4.19b shows that Hom(Fψ, 1) carries Hom_Z(FB, D) onto
Hom_Z(FA, D). Therefore (∗∗) is exact, and we conclude that I is injective
in C_R.
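
The case R = Z of Proposition 4.20 can be made concrete for a finite cyclic group. Taking A = Z/n in the proof gives the free group Z, the subgroup S = nZ, and D equal to the rationals, and dividing by n identifies D/S with the rationals modulo 1. Under these identifications, which are assumptions of the sketch below rather than statements in the text, the embedding sends the class of k to the class of k/n. The Python code checks in one small case that this map is one-one and additive; the function names are hypothetical.

```python
from fractions import Fraction


def embed(k, n):
    """Image of the class of k in Z/n under k -> k/n, as a fraction in [0, 1)."""
    return Fraction(k % n, n)


def add_mod_1(x, y):
    """Addition in the rationals modulo 1, with representatives in [0, 1)."""
    return (x + y) % 1


n = 12
images = {embed(k, n) for k in range(n)}
one_one = len(images) == n
additive = all(
    embed(j + k, n) == add_mod_1(embed(j, n), embed(k, n))
    for j in range(n)
    for k in range(n)
)
print(one_one, additive)
```

The target group is divisible by Example 2 of injectives, so the map exhibits Z/n as a subgroup of an injective abelian group.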


Derived functors of an additive functor F from one good category to another 
will be useful when F is left exact or right exact, and there will be one derived 
functor for each integer n ≥ 0. The value of the nth derived functor on a module
M is obtained by taking a projective or injective resolution of M according to
the rule in Figure 4.5, applying F to the resolution, dropping the term F(M)
that occurs in degree −1, and forming the nth homology or cohomology of the
resulting complex. The full traditional notation for the derived functor in question
appears in Figure 4.5, along with an abbreviated notation that we shall tend to 
use. 

The choice of projective or injective resolution at the start is made in such a 
way that the 0th derived functor is naturally isomorphic to F; this condition will
be clarified in Proposition 4.21 below. If a projective resolution is to be used, 
one makes the assumption that the domain category has enough projectives; if 
an injective resolution is to be used, one makes the assumption that the domain 
category has enough injectives. 

If the resulting complex obtained by applying F to the resolution is a chain 
complex, the abbreviated notation is F_n for the nth derived functor; otherwise it
is F^n. The full traditional notation involves using an L or R in front of F to
denote the one-sided exactness, left or right, that F is not assumed to have, and 
the subscript or superscript n is moved from F to the L or R. 


Exactness   -variant   Resolution   -ology    Notation      Example
right       co-        projective   hom-      F_n, L_nF     M ⊗_R ( · )
right       contra-    injective    hom-      F_n, L_nF     M ⊗_Z Hom_Z( · , I), I injective
left        co-        injective    cohom-    F^n, R^nF     Hom_R(M, · )
left        contra-    projective   cohom-    F^n, R^nF     Hom_R( · , M)

FIGURE 4.5. Formation of derived functors. 
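
As a small instance of the first row of Figure 4.5, one can carry out the recipe by hand for abelian groups. The Python sketch below is an illustration under stated assumptions: it takes F = Z/m ⊗_Z ( · ) and M = Z/n, uses the projective resolution 0 → Z → Z → Z/n → 0 whose first map is multiplication by n, and identifies F(Z) with Z/m so that the resolution becomes multiplication by n on Z/m. The homology in degree 1 is then the kernel of that map and the homology in degree 0 is its cokernel. The function name is hypothetical.

```python
from math import gcd


def derived_functors_of_tensor(m, n):
    """Orders of F_0(Z/n) and F_1(Z/n) for F = Z/m tensor ( . ), with m, n >= 1."""
    # After applying F and dropping F(M), the complex is
    #     0 -> Z/m --(multiplication by n)--> Z/m -> 0.
    def boundary(x):
        return (n * x) % m

    kernel = [x for x in range(m) if boundary(x) == 0]   # homology in degree 1
    image = {boundary(x) for x in range(m)}
    order_f0 = m // len(image)                            # order of the cokernel
    order_f1 = len(kernel)
    return order_f0, order_f1


for m, n in [(12, 18), (5, 3)]:
    print(m, n, derived_functors_of_tensor(m, n), gcd(m, n))
```

In every case both groups have order equal to the greatest common divisor of m and n. The value in degree 0 has the same order as Z/m ⊗_Z Z/n, in line with the requirement that the 0th derived functor recover F.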


There are several things that need elaboration in this definition, and we take 
them up right away. 

First there is the fact that F_n(M) or F^n(M) is well defined. Suppose that we
start with two resolutions X and X' of M (projective or injective by the rules in
Figure 4.5). Corollary 4.13 or 4.17 gives us chain or cochain maps f : X → X'
and g : X' → X with f_{−1} = 1_M and g_{−1} = 1_M and shows that gf : X → X is
homotopic to 1_X and that fg : X' → X' is homotopic to 1_{X'}. For definiteness
let us suppose that F is covariant and right exact; then chain maps are involved
and the derived functors of F are to be denoted by F_n. Applying F to our chain
maps, we obtain chain maps F(f) : F(X) → F(X'), F(g) : F(X') → F(X),
F(gf) : F(X) → F(X), and F(fg) : F(X') → F(X'). The last two of these
are homotopic to 1_{F(X)} : F(X) → F(X) and to 1_{F(X')} : F(X') → F(X'),
respectively, by F of the respective homotopies. Proposition 4.1 shows that
F(g)F(f) = F(gf) induces the identity on H_n(F(X)) and that F(f)F(g) =
F(fg) induces the identity on H_n(F(X')). Consequently the mappings induced
on homology by F(f) and F(g) are two-sided inverses of one another. Thus
F_n(M) as computed from X is isomorphic to F_n(M) as computed from X'.

Moreover, this isomorphism is canonical. If f' : X → X' is another chain
map, then the same calculation shows that F(f') and F(g) induce two-sided
inverses of each other on homology, and hence F(f) = F(f') on homology.
Thus F_n(M) is well defined up to canonical isomorphism when F is covariant
and right exact. The other three situations in Figure 4.5 are handled in similar
fashion and lead to analogous conclusions.

Next we make F_n or F^n into a functor. To do so, let φ : M → M' be given. For
definiteness, again let us suppose that F is covariant and right exact. Let X and X'
be projective resolutions of M and M', respectively, and apply Theorem 4.12 to
produce a chain map Φ : X → X' with Φ_{−1} = φ. Then F(Φ) : F(X) → F(X')
is a chain map and induces maps on homology that we denote by F_n(φ). Here
F_n(φ) maps F_n(M) into F_n(M').

Let us see that F_n(φ) is well defined. If X is replaced by X̃, Corollary 4.13
produces chain maps f : X → X̃ and g : X̃ → X with f_{−1} = 1_M and g_{−1} = 1_M,
and Theorem 4.12 produces a chain map Φ̃ : X̃ → X' with Φ̃_{−1} = φ. Since Φ̃ ∘ f
and Φ are both chain maps from X to X' that equal φ in degree −1, Theorem
4.12 shows that Φ̃ ∘ f is homotopic to Φ. Similarly Φ ∘ g and Φ̃ are chain
maps from X̃ to X' and are homotopic. By Proposition 4.1, F(Φ̃ ∘ f) = F(Φ)
on homology, and F(Φ ∘ g) = F(Φ̃) on homology. Thus on homology F(Φ̃)
corresponds to F(Φ) under the canonical isomorphism F(f), whose inverse on
homology is F(g). In short, F_n(φ) is well defined up to the previously obtained
canonical isomorphisms. The other three situations in Figure 4.5 are handled in
similar fashion and lead to analogous conclusions.

Tracing through the definition of how derived functors affect maps, we see 
that the map 1 goes to the map 1 and that compositions go to compositions, in 
the same order as for F. Thus the derived functors are indeed functors. The 
derived functors of a covariant functor are covariant, and the derived functors of 
a contravariant functor are contravariant. 

We need to check that the derived functors are additive. If φ : M → M' and
φ' : M → M' are given, then we can proceed as above and use a single resolution
of M and a single resolution of M' to investigate φ, φ', and φ + φ'. Then it
is apparent that the chain or cochain maps built from maps of M to M' add in
the same way as the maps, and the result is that each F_n or F^n is additive with
particular choices of the resolutions in place. Allowing the resolutions to vary 
means that we have to take canonical isomorphisms into account, and after doing 
so, we still get additivity. 

If two functors F and G from C to C' of the same type in Figure 4.5 are naturally
isomorphic, then F_n and G_n (or else F^n and G^n) are naturally isomorphic for all
n. In fact, if T is the natural isomorphism, then T associates a member T_A
of Hom(F(A), G(A)) to each module A in C. Take a projective or injective
resolution X = {X_n} of A, as appropriate, and form the two complexes F(X) and
G(X). The system {T_{X_n}} is then a chain map from F(X) to G(X), with inverse
{T_{X_n}^{−1}}, and the homology or cohomology of F(X) is exhibited as isomorphic to
the homology or cohomology of G(X). This much shows that F_n(A) ≅ G_n(A)
(or F^n(A) ≅ G^n(A)) for all n. We omit the details of verifying the naturality of
this isomorphism in the A variable for each n.


Proposition 4.21. In the four situations of derived functors in Figure 4.5, under 
the assumption that the domain category for F has enough projectives or enough 
injectives as appropriate, the 0th derived functor of F is naturally isomorphic to F.


PROOF IF F IS COVARIANT AND RIGHT EXACT. Let

    X_1 --∂_0--> X_0 --ε--> M → 0

be the terms in degree 1, 0, −1, −2 of a projective resolution of M. By Proposition
4.5 and its remark, the right exactness and covariance of F imply that

    F(X_1) --F(∂_0)--> F(X_0) --F(ε)--> F(M) → 0

is exact. The derived-functor module F_0(M) is computed as the 0th homology of

    F(X_1) --F(∂_0)--> F(X_0) → 0.

Thus

    F_0(M) = F(X_0)/image F(∂_0) = F(X_0)/ker F(ε).

Since F(ε) is onto F(M), the right side here is ≅ F(M) via F(ε).
This establishes the isomorphism. Let us prove that it is natural in the variable
M. If φ : M → M' is given, we are to prove that the diagram

    F_0(M)  ---≅--->  F(M)
      |                 |
      | F_0(φ)          | F(φ)          (∗)
      v                 v
    F_0(M') ---≅--->  F(M')

commutes, the horizontal isomorphisms being induced by F(ε) and F(ε').
Using Theorem 4.12, we form the part of a chain map that is indicated:

    X_1  --∂_0-->  X_0  --ε-->  M   → 0
     |              |           |
     | Φ_1          | Φ_0       | φ
     v              v           v
    X'_1 --∂'_0--> X'_0 --ε'--> M'  → 0

Application of F gives a commutative diagram

    F(X_0)  --F(ε)-->  F(M)
      |                  |
      | F(Φ_0)           | F(φ)
      v                  v
    F(X'_0) --F(ε')--> F(M')

and this becomes (∗) upon passage to the quotients F(X_0)/ker F(ε) and
F(X'_0)/ker F(ε'). This completes the proof.


EXAMPLES. 


(1) The invariants functor F(M) = M^G for a group G. Suppose that a group
G acts on an abelian group M by automorphisms. This situation is completely
equivalent to considering M as a unital left ZG module, where ZG is the integer
group ring of G. The subgroup of invariants of M is

    M^G = {m ∈ M | gm = m for all g ∈ G}.

The formulas F(M) = M^G for such a module M and F(h) = h|_{M^G} for h in
Hom_{ZG}(M, M') define a covariant additive functor called the invariants functor;
we can think of F as carrying C_{ZG} into itself, but it is preferable to think of it as
carrying C_{ZG} into the category C_Z of abelian groups. The functor F is naturally
isomorphic to the functor H = Hom_{ZG}(Z, · ), where Z is made into a ZG
module with trivial G action; as with F, we consider H as a functor from C_{ZG}
to C_Z. To see the isomorphism, we associate to each module M the abelian-group
homomorphism T_M : M^G → Hom_{ZG}(Z, M) defined by T_M(m) = φ_m with
φ_m(k) = km for all k ∈ Z. If h is in Hom_{ZG}(M, M'), then the two maps T_{M'} ∘ F(h)
and H(h) ∘ T_M of F(M) into H(M') are equal, since at each m ∈ M^G we have

    H(h)T_M(m) = H(h)(φ_m) = Hom(1, h)(φ_m) = hφ_m = φ_{h(m)} = T_{M'}F(h)(m).

This identity means that {T_M} is a natural transformation; we readily check for
each M that T_M carries M^G one-one onto Hom_{ZG}(Z, M), and thus {T_M} is a natural
isomorphism.

Because of this natural isomorphism, the invariants functor is covariant and left
exact. Its derived functors F^n or H^n are obtained by using an injective resolution
I ← M ← 0, applying the functor ( · )^G or Hom_{ZG}(Z, · ), dropping the term in
degree −1, and forming cohomology. Briefly

    F^n(M) = H^n(I^G) = H^n(Hom_{ZG}(Z, I))    for an injective resolution I ← M ← 0.

It turns out that the result is given also by the cohomology-of-groups functors
H^n(G, M) even though this was not the procedure by which we obtained group
cohomology in Section III.5. In fact, what Section III.5 said to do was to start
from a free resolution (a projective resolution would have been good enough)
such as P → Z → 0 of Z in C_{ZG}, apply the contravariant left exact functor
Hom_{ZG}( · , M), drop the term in degree −1, and form cohomology. Briefly then,
Section III.5 said that

    H^n(G, M) = H^n(Hom_{ZG}(P, M))    for a projective resolution P → Z → 0.

The fact that H^n(G, M) can be computed in either of these ways is not particularly
obvious from what we have done so far, but it will be a special case of the natural
isomorphism of functors Ext^n and ext^n that is proved as Theorem 4.31 in Section 7.
With either formula for H^n(G, M), we obtain H^0(G, M) = M^G in agreement
with Proposition 4.21.


(2) The co-invariants functor $F(M) = M_G$ for a group $G$. In the same setting as in Example 1, the group of co-invariants of $M$ is the quotient

$$M_G = M \big/ (\text{subgroup generated by all } gm - m \text{ for } g \in G,\ m \in M).$$

The functor $F$ can be seen to be naturally isomorphic to the functor $H$ with $H(M) = \mathbb{Z} \otimes_{\mathbb{Z}G} M$. It is therefore covariant and right exact. Its derived functors are given by

$$F_n(M) = H_n(P_G) = H_n(\mathbb{Z} \otimes_{\mathbb{Z}G} P) \qquad \text{for a projective resolution } P \to M \to 0.$$

These are by definition the homology-of-groups functors $H_n(G, M)$. Although the equality is not particularly obvious, $H_n(G, M)$ can be computed also from

$$H_n(G, M) = H_n(P \otimes_{\mathbb{Z}G} M) \qquad \text{for a projective resolution } P \to \mathbb{Z} \to 0.$$

This isomorphism is a special case of the natural isomorphism of functors $\operatorname{Tor}_n$ and $\operatorname{tor}_n$ that is mentioned just before Proposition 4.29 in Section 7; the proof is completely analogous to the proof of Theorem 4.31. With either formula for $H_n(G, M)$, we obtain $H_0(G, M) = M_G$ in agreement with Proposition 4.21.
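
As a quick check on the definitions in Examples 1 and 2, here is a small worked computation. It is only an illustration: the group $G = \{1, g\}$ of order 2 and the two actions on $M = \mathbb{Z}$ are assumptions chosen for the example and do not come from the text.

```latex
% Illustrative assumption: G = {1, g} has order 2 and M = \mathbb{Z}.
% Case 1: trivial action, gm = m for all m.
\[
  M^G = \{ m \in \mathbb{Z} : gm = m \} = \mathbb{Z},
  \qquad
  M_G = \mathbb{Z} / \langle gm - m \rangle = \mathbb{Z} / \{0\} \cong \mathbb{Z}.
\]
% Case 2: sign action, gm = -m for all m.
\[
  M^G = \{ m \in \mathbb{Z} : -m = m \} = 0,
  \qquad
  M_G = \mathbb{Z} / \langle -2m : m \in \mathbb{Z} \rangle
      = \mathbb{Z} / 2\mathbb{Z}.
\]
```

In the second case the invariants vanish while the co-invariants do not, which shows that the subgroup $M^G$ and the quotient $M_G$ can differ substantially.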


(3) Derived functors with $R = \mathbb{Z}$. For the ring $\mathbb{Z}$ and the category $\mathcal{C}_{\mathbb{Z}}$ (or more generally for $\mathcal{C}_R$ for any principal ideal domain $R$), projective resolutions and injective resolutions can be fairly short, and derived functors in degree $\ge 2$ are all 0. Let $M$ be a given unital $\mathbb{Z}$ module, i.e., an abelian group. We know that $M$ is the quotient of some free abelian group $X_0$, say with a quotient map $\varepsilon$, and then $X_1 = \ker\varepsilon$ is a subgroup of a free abelian group and hence is free abelian. Thus a projective resolution of $M$ is

$$0 \longrightarrow X_1 \xrightarrow{\ \text{inc}\ } X_0 \xrightarrow{\ \varepsilon\ } M \longrightarrow 0.$$

The kinds of derived functors that make use of projective resolutions are the covariant right exact ones and the contravariant left exact ones. If $F$ is such a functor, then we are led to the complexes

$$0 \longleftarrow F(X_0) \xleftarrow{\ F(\text{inc})\ } F(X_1) \longleftarrow 0 \qquad \text{and} \qquad 0 \longrightarrow F(X_0) \xrightarrow{\ F(\text{inc})\ } F(X_1) \longrightarrow 0$$

in the two cases. Thus the values of the derived functors are $F_0(M) = F(M)$ and $F_1(M) = \ker F(\text{inc})$ in the first case, and $F^0(M) = F(M)$ and $F^1(M) = \operatorname{coker} F(\text{inc})$ in the second case. Higher derived functors are 0. Similar remarks apply to injective resolutions and the remaining two cases for derived functors in Figure 4.5. Every abelian group is a subgroup of a divisible group, which is injective in $\mathcal{C}_{\mathbb{Z}}$, and the quotient of the divisible group by the given abelian group is divisible, hence injective. Thus we can arrange for all terms of an injective resolution to be 0 beyond the $X_1$ term, and an analysis of the results similar to the one above is possible.
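
To make Example 3 concrete, the following sketch carries out the computation for one covariant right exact functor. The choices $M = \mathbb{Z}/n\mathbb{Z}$ and $F = \mathbb{Z}/m\mathbb{Z} \otimes_{\mathbb{Z}} (\,\cdot\,)$, with integers $m, n \ge 1$ and $d = \gcd(m, n)$, are assumptions made for illustration only.

```latex
% Illustrative assumptions: M = Z/nZ, F = (Z/mZ) \otimes_Z ( . ), d = gcd(m, n).
% Projective resolution of M with X_0 = X_1 = Z and inc = multiplication by n:
\[
  0 \longrightarrow \mathbb{Z} \xrightarrow{\ n\ } \mathbb{Z}
    \xrightarrow{\ \varepsilon\ } \mathbb{Z}/n\mathbb{Z} \longrightarrow 0 .
\]
% Apply F and drop the term F(M); use (Z/mZ) \otimes_Z Z \cong Z/mZ:
\[
  0 \longleftarrow \mathbb{Z}/m\mathbb{Z} \xleftarrow{\ n\ }
    \mathbb{Z}/m\mathbb{Z} \longleftarrow 0 .
\]
% Homology of this two-term complex:
\begin{align*}
  F_0(M) &= \operatorname{coker}\bigl(n : \mathbb{Z}/m\mathbb{Z} \to \mathbb{Z}/m\mathbb{Z}\bigr)
          = \mathbb{Z}/(m\mathbb{Z} + n\mathbb{Z}) = \mathbb{Z}/d\mathbb{Z}
          \cong \mathbb{Z}/m\mathbb{Z} \otimes_{\mathbb{Z}} \mathbb{Z}/n\mathbb{Z}, \\
  F_1(M) &= \ker\bigl(n : \mathbb{Z}/m\mathbb{Z} \to \mathbb{Z}/m\mathbb{Z}\bigr)
          = \{ x \in \mathbb{Z}/m\mathbb{Z} : nx = 0 \} \cong \mathbb{Z}/d\mathbb{Z}, \\
  F_k(M) &= 0 \quad \text{for } k \ge 2 .
\end{align*}
```

The degree-0 value agrees with $F(M)$, as Proposition 4.21 requires, and everything in degree $\ge 2$ vanishes because the resolution has only two nonzero terms.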


6. Long Exact Sequences of Derived Functors 


The first four theorems of this section say that a short exact sequence of modules 
leads to a long exact sequence of derived functor modules and that it does so in 
a functorial way. Let us suppose that $F : \mathcal{C} \to \mathcal{C}'$ is an additive functor between
good categories. For the first of the theorems, suppose further that C has enough 
projectives and that F is one of the types of functors in Figure 4.5 making use of 
projective resolutions in the definition of its derived functors. The last of these 
conditions means that F is to be covariant right exact or contravariant left exact. 

To prove such a theorem, we shall want to apply Theorem 4.7, which produces 
a long exact sequence from a short exact sequence of complexes. To each of the 
modules in the given short exact sequence, we attach a projective resolution. If 
these projective resolutions can somehow be related by chain maps so as to give 
a short exact sequence of projectives in each degree, then we can apply F to the 
entire diagram, invoke Theorem 4.7, and obtain the desired long exact sequence. 
Application of Theorem 4.10, in combination with some further checking, will 
show that the passage from the given short exact sequence of modules to the long 
exact sequence of derived functor modules is functorial in the modules of the 
short exact sequence. 

Thus the problem is to obtain the compatible projective resolutions. Proposition 4.19a gives us a clue about what to look for: any short exact sequence of projectives has to be split. Here is the statement of the first theorem.

Theorem 4.22. Let $F : \mathcal{C} \to \mathcal{C}'$ be an additive functor between two good categories. Suppose that $F$ either is covariant right exact or is contravariant left exact, and suppose that $\mathcal{C}$ has enough projectives. Whenever there are three modules and two maps in $\mathcal{C}$ forming a short exact sequence

$$0 \longrightarrow A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \longrightarrow 0,$$

then the derived functors of $F$ on the three modules form a long exact sequence in $\mathcal{C}'$ as follows:

(a) If $F$ is covariant and right exact, then the long exact sequence is
$$0 \leftarrow F(C) \leftarrow F(B) \leftarrow F(A) \leftarrow F_1(C) \leftarrow F_1(B) \leftarrow F_1(A) \leftarrow F_2(C) \leftarrow F_2(B) \leftarrow F_2(A) \leftarrow F_3(C) \leftarrow \cdots.$$
(b) If $F$ is contravariant and left exact, then the long exact sequence is
$$0 \to F(C) \to F(B) \to F(A) \to F^1(C) \to F^1(B) \to F^1(A) \to F^2(C) \to F^2(B) \to F^2(A) \to F^3(C) \to \cdots.$$

We begin with a lemma.


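Lemma 4.23. In the good category $\mathcal{C}$, suppose that the diagram

$$\begin{array}{ccccccccc}
 & & 0 & & 0 & & 0 & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \longleftarrow & A & \xleftarrow{\ \varepsilon_A\ } & P_A & \xleftarrow{\ \psi_A\ } & M_A & \longleftarrow & 0 \\
 & & {\scriptstyle\varphi}\downarrow & & {\scriptstyle i_A}\downarrow & & {\scriptstyle\varphi_1}\downarrow & & \\
0 & \longleftarrow & B & \xleftarrow{\ \varepsilon_B\ } & P_A \oplus P_C & \xleftarrow{\ \psi_B\ } & M_B & \longleftarrow & 0 \\
 & & {\scriptstyle\psi}\downarrow & & {\scriptstyle p_C}\downarrow & & {\scriptstyle\psi_1}\downarrow & & \\
0 & \longleftarrow & C & \xleftarrow{\ \varepsilon_C\ } & P_C & \xleftarrow{\ \psi_C\ } & M_C & \longleftarrow & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \\
 & & 0 & & 0 & & 0 & &
\end{array}$$

(in which the arrows $\varepsilon_B$, $\psi_B$, $\varphi_1$, $\psi_1$ are dashed) has the first two columns and the two rows with solid arrows exact and has $P_A$ and $P_C$ projective. Here $i_A$ is the inclusion into the first component of $P_A \oplus P_C$, and $p_C$ is the projection onto the second component. Then there exist a module $M_B$ and maps $\varepsilon_B$, $\psi_B$, $\varphi_1$, and $\psi_1$ such that the whole diagram, including the dashed arrows, has exact rows and columns and has all squares commuting.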


212 IV. Homological Algebra 


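PROOF. The module $P_A \oplus P_C$ is in $\mathcal{C}$ because $\mathcal{C}$ is good, and it is easy to see that $P_A \oplus P_C$ is projective. Let us define $\varepsilon_B$. Since $P_C$ is projective, there exists $h : P_C \to B$ such that $\psi h = \varepsilon_C$, and we put $\varepsilon_B(x_A, x_C) = \varphi\varepsilon_A x_A + h x_C$. Then the equation

$$\varphi\varepsilon_A x_A = \varepsilon_B(x_A, 0) = \varepsilon_B i_A x_A$$

says that the upper left square commutes, and the equation

$$\psi\varepsilon_B(x_A, x_C) = \psi\varphi\varepsilon_A x_A + \psi h x_C = 0 + \varepsilon_C x_C = \varepsilon_C p_C(x_A, x_C)$$

says that the lower left square commutes.

To see that $\varepsilon_B$ is onto $B$, let $b \in B$ be given. Since $p_C$ and $\varepsilon_C$ are onto, so is $\varepsilon_C p_C = \psi\varepsilon_B$. Thus we can choose $(x_A, x_C)$ in $P_A \oplus P_C$ with $\psi(b) = \psi\varepsilon_B(x_A, x_C)$. Hence $b - \varepsilon_B(x_A, x_C)$ lies in $\ker\psi = \operatorname{image}\varphi$, and we can write

$$b - \varepsilon_B(x_A, x_C) = \varphi(a) = \varphi\varepsilon_A(x'_A) = \varepsilon_B i_A(x'_A) = \varepsilon_B(x'_A, 0)$$

for some $a \in A$ and some $x'_A \in P_A$. Then $b = \varepsilon_B(x_A + x'_A, x_C)$, and $\varepsilon_B$ is onto.

Let $M_B = \ker\varepsilon_B$, and let $\psi_B : M_B \to P_A \oplus P_C$ be the inclusion. For $m_A$ in $M_A$, let $\varphi_1(m_A) = (\psi_A m_A, 0)$. Then $\varphi_1(m_A)$ is in $M_B$ because

$$\varepsilon_B(\psi_A m_A, 0) = \varphi\varepsilon_A\psi_A m_A + h0 = \varphi 0 + h0 = 0.$$

Moreover, this definition of $\varphi_1$ makes the upper right square commute.

To define $\psi_1$, let $(x_A, x_C)$ be in $M_B$, so that $\varepsilon_B(x_A, x_C) = 0$. Then $0 = \psi\varepsilon_B(x_A, x_C) = \varepsilon_C p_C(x_A, x_C) = \varepsilon_C(x_C)$, $x_C$ lies in $\ker\varepsilon_C = \operatorname{image}\psi_C$, and $x_C = \psi_C(m_C)$ for a unique $m_C$ in $M_C$. We put $\psi_1(x_A, x_C) = m_C$. Then the equation

$$\psi_C\psi_1(x_A, x_C) = \psi_C(m_C) = x_C = p_C(x_A, x_C) = p_C\psi_B(x_A, x_C)$$

shows that the lower right square commutes.

Now all the squares commute, and all the rows and columns are exact except possibly the third column. Corollary 4.8 allows us to conclude that the third column is exact, and the proof of the lemma is complete.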


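PROOF OF THEOREM 4.22. The main step is to construct projective resolutions of $A$, $B$, and $C$ by an inductive process in such a way that the three resolutions together form an exact sequence of chain complexes. We start by forming projective resolutions

$$0 \longleftarrow A \xleftarrow{\ \varepsilon_A\ } X_0 \xleftarrow{\ \alpha_0\ } X_1 \xleftarrow{\ \alpha_1\ } \cdots \qquad \text{and} \qquad 0 \longleftarrow C \xleftarrow{\ \varepsilon_C\ } Z_0 \xleftarrow{\ \gamma_0\ } Z_1 \xleftarrow{\ \gamma_1\ } \cdots.$$

Replacing $X_1$ by $M_A = \ker\varepsilon_A$ and $Z_1$ by $M_C = \ker\varepsilon_C$, we are led to the starting diagram in Lemma 4.23. Application of the lemma produces a short exact sequence

$$0 \longleftarrow B \xleftarrow{\ \varepsilon_B\ } X_0 \oplus Z_0 \longleftarrow M_B \longleftarrow 0$$

and the vertical maps $\varphi_1$ and $\psi_1$ that make the squares commute in the lemma. Next we move everything one step to the right, applying the lemma to a diagram as in the lemma with first and third rows

$$0 \longleftarrow \ker\varepsilon_A \xleftarrow{\ \alpha_0\ } X_1 \xleftarrow{\ \text{inc}\ } \ker\alpha_0 \longleftarrow 0 \qquad \text{and} \qquad 0 \longleftarrow \ker\varepsilon_C \xleftarrow{\ \gamma_0\ } Z_1 \xleftarrow{\ \text{inc}\ } \ker\gamma_0 \longleftarrow 0$$

and with an exact sequence in the first column involving the maps $\varphi_1$ and $\psi_1$. Application of the lemma produces a short exact sequence

$$0 \longleftarrow \ker\varepsilon_B \xleftarrow{\ \beta_0\ } X_1 \oplus Z_1 \xleftarrow{\ \text{inc}\ } \ker\beta_0 \longleftarrow 0$$

and the vertical maps $\varphi_2$ and $\psi_2$ that make the squares commute in the lemma. We can put these steps together to form the following diagram with exact rows and columns and with commuting squares:

$$\begin{array}{ccccccccccc}
 & & 0 & & 0 & & 0 & & 0 & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \longleftarrow & A & \xleftarrow{\ \varepsilon_A\ } & X_0 & \xleftarrow{\ \alpha_0\ } & X_1 & \xleftarrow{\ \text{inc}\ } & \ker\alpha_0 & \longleftarrow & 0 \\
 & & {\scriptstyle\varphi}\downarrow & & {\scriptstyle i_{X_0}}\downarrow & & {\scriptstyle i_{X_1}}\downarrow & & {\scriptstyle\varphi_2}\downarrow & & \\
0 & \longleftarrow & B & \xleftarrow{\ \varepsilon_B\ } & X_0 \oplus Z_0 & \xleftarrow{\ \beta_0\ } & X_1 \oplus Z_1 & \xleftarrow{\ \text{inc}\ } & \ker\beta_0 & \longleftarrow & 0 \\
 & & {\scriptstyle\psi}\downarrow & & {\scriptstyle p_{Z_0}}\downarrow & & {\scriptstyle p_{Z_1}}\downarrow & & {\scriptstyle\psi_2}\downarrow & & \\
0 & \longleftarrow & C & \xleftarrow{\ \varepsilon_C\ } & Z_0 & \xleftarrow{\ \gamma_0\ } & Z_1 & \xleftarrow{\ \text{inc}\ } & \ker\gamma_0 & \longleftarrow & 0 \\
 & & \downarrow & & \downarrow & & \downarrow & & \downarrow & & \\
 & & 0 & & 0 & & 0 & & 0 & &
\end{array}$$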


We can repeat the use of Lemma 4.23, starting from the last column of the above diagram and more of the projective resolutions of $A$ and $C$, and then we can merge the new result with the diagram above to obtain a diagram with one additional column. Continuing in this way, we arrive at three projective resolutions and vertical maps that together form an exact sequence of chain complexes.

To obtain a long exact sequence for our derived functors, we apply the functor $F$ to the final diagram above, except that we drop the left column of 0's and the column containing $A$, $B$, $C$. After the application of $F$, the remaining columns are still exact because the columns in $\mathcal{C}$ are split and because $F$ sends split exact sequences to split exact sequences.¹⁰ Then we apply Theorem 4.7, taking Proposition 4.21 into account, and the long exact sequence results except for the one detail of the 0 at the end. In other words, we still have to prove exactness at $F(C)$. But exactness at this point is immediate from the assumed one-sided exactness of $F$. This completes the proof.
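
A small instance of Theorem 4.22a may help fix the shape of the long exact sequence. The short exact sequence $0 \to \mathbb{Z} \xrightarrow{2} \mathbb{Z} \to \mathbb{Z}/2\mathbb{Z} \to 0$ and the functor $F = \mathbb{Z}/2\mathbb{Z} \otimes_{\mathbb{Z}} (\,\cdot\,)$ below are assumptions chosen for illustration; the values of $F_1$ are computed as in Example 3 of Section 5.

```latex
% Illustrative assumptions: A = B = Z, C = Z/2Z, \varphi = multiplication by 2,
% and F = (Z/2Z) \otimes_Z ( . ), a covariant right exact functor on C_Z.
\begin{align*}
  F(A) &= F(B) = \mathbb{Z}/2\mathbb{Z}, & F(C) &= \mathbb{Z}/2\mathbb{Z}, \\
  F_1(A) &= F_1(B) = 0 \ \ (\mathbb{Z} \text{ is projective}), &
  F_1(C) &= \ker\bigl(2 : \mathbb{Z}/2\mathbb{Z} \to \mathbb{Z}/2\mathbb{Z}\bigr)
          = \mathbb{Z}/2\mathbb{Z}.
\end{align*}
% The long exact sequence of Theorem 4.22a therefore reads
\[
  0 \longleftarrow \mathbb{Z}/2\mathbb{Z}
    \xleftarrow{\ \cong\ } \mathbb{Z}/2\mathbb{Z}
    \xleftarrow{\ 0\ } \mathbb{Z}/2\mathbb{Z}
    \xleftarrow{\ \cong\ } \mathbb{Z}/2\mathbb{Z}
    \longleftarrow 0 \longleftarrow 0 \longleftarrow \cdots ,
\]
% with terms F(C), F(B), F(A), F_1(C), F_1(B), F_1(A), ... from left to right.
```

Here $F(\varphi)$ is multiplication by 2 on $\mathbb{Z}/2\mathbb{Z}$, hence the zero map, so exactness forces the connecting map $F_1(C) \to F(A)$ to be an isomorphism.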


Before addressing the functoriality of the association in Theorem 4.22, let us 
record the corresponding result when the derived functor makes use of injective 
resolutions. 


Theorem 4.24. Let $F : \mathcal{C} \to \mathcal{C}'$ be an additive functor between two good categories. Suppose that $F$ either is contravariant right exact or is covariant left exact, and suppose that $\mathcal{C}$ has enough injectives. Whenever there are three modules and two maps in $\mathcal{C}$ forming a short exact sequence

$$0 \longrightarrow A \xrightarrow{\ \varphi\ } B \xrightarrow{\ \psi\ } C \longrightarrow 0,$$

then the derived functors of $F$ on the three modules form a long exact sequence in $\mathcal{C}'$ as follows:

(a) If $F$ is contravariant and right exact, then the long exact sequence is
$$0 \leftarrow F(A) \leftarrow F(B) \leftarrow F(C) \leftarrow F_1(A) \leftarrow F_1(B) \leftarrow F_1(C) \leftarrow F_2(A) \leftarrow F_2(B) \leftarrow F_2(C) \leftarrow F_3(A) \leftarrow \cdots.$$
(b) If $F$ is covariant and left exact, then the long exact sequence is
$$0 \to F(A) \to F(B) \to F(C) \to F^1(A) \to F^1(B) \to F^1(C) \to F^2(A) \to F^2(B) \to F^2(C) \to F^3(A) \to \cdots.$$


¹⁰ A split exact sequence is the union of two four-term exact sequences from each end, and $F$ is exact on each of these. In addition, we saw in Section 2 that $F$ respects direct sums. It follows that $F$ carries split exact sequences to split exact sequences.

PROOF. The necessary modifications to the proof of Theorem 4.22 are fairly straightforward, but some comments are in order concerning how Lemma 4.23 is to be modified. In the diagram in the statement of Lemma 4.23, all the horizontal arrows are to be reversed, the projectives $P_A$ and $P_C$ are to be replaced by injectives $I_A$ and $I_C$, and $M_A$ and $M_C$ are the quotients $M_A = I_A/\varepsilon_A(A)$ and $M_C = I_C/\varepsilon_C(C)$. Let us define $\varepsilon_B$. Since $I_A$ is injective, choose $h : B \to I_A$ with $h\varphi = \varepsilon_A$, and put $\varepsilon_B(b) = (h(b), \varepsilon_C\psi(b))$. Then the equation

$$\varepsilon_B\varphi(a) = (h\varphi a, \varepsilon_C\psi\varphi a) = (\varepsilon_A(a), 0) = i_A\varepsilon_A(a)$$

says that the upper left square commutes, and the equation

$$\varepsilon_C\psi(b) = p_C(h(b), \varepsilon_C\psi(b)) = p_C\varepsilon_B(b)$$

says that the lower left square commutes.

To see that $\varepsilon_B$ is one-one, let $\varepsilon_B(b) = 0$. Then $0 = p_C\varepsilon_B(b) = \varepsilon_C\psi(b)$. Since $\varepsilon_C$ is one-one, $\psi(b) = 0$, $b$ lies in $\ker\psi = \operatorname{image}\varphi$, and $b = \varphi(a)$. Then $0 = \varepsilon_B(b) = \varepsilon_B\varphi(a) = i_A\varepsilon_A(a)$, and $a = 0$ because $i_A$ and $\varepsilon_A$ are one-one. Hence $b = \varphi(a) = 0$, and $\varepsilon_B$ is one-one.

Let $M_B = (I_A \oplus I_C)/\varepsilon_B(B)$, and let $\psi_B : I_A \oplus I_C \to M_B$ be the quotient map. To define $\varphi_1$, we let $\varphi_1(m_A) = \psi_B(x_A, 0)$ if $m_A = \psi_A x_A$ with $x_A \in I_A$. If $x'_A$ is another preimage of $m_A$ under $\psi_A$, then $x'_A - x_A = \varepsilon_A(a)$ for some $a \in A$, and $\psi_B(x'_A, 0) - \psi_B(x_A, 0) = \psi_B i_A\varepsilon_A(a) = \psi_B\varepsilon_B\varphi(a) = 0$; hence $\varphi_1$ is well defined. Since $\psi_B i_A x_A = \psi_B(x_A, 0) = \varphi_1 m_A = \varphi_1\psi_A x_A$, the upper right square commutes. To define $\psi_1$, let $m_B \in M_B$ be $\psi_B(x_A, x_C)$, and define $\psi_1(m_B) = \psi_C(x_C)$. If $(x'_A, x'_C)$ is another preimage of $m_B$ under $\psi_B$, then $(x'_A, x'_C) - (x_A, x_C) = \varepsilon_B(b)$ for some $b \in B$, and $\psi_C(x'_C) - \psi_C(x_C) = \psi_C p_C(x'_A, x'_C) - \psi_C p_C(x_A, x_C) = \psi_C p_C\varepsilon_B(b) = \psi_C\varepsilon_C\psi(b) = 0$; hence $\psi_1$ is well defined. Since $\psi_C p_C(x_A, x_C) = \psi_C(x_C) = \psi_1(m_B) = \psi_1\psi_B(x_A, x_C)$, the lower right square commutes.

Now all the squares commute, and all the rows and columns are exact except 
possibly the third column. Corollary 4.8 allows us to conclude that the third 
column is exact, and the proof of the analog of Lemma 4.23 for injectives is 
complete. Theorem 4.24 then follows routinely. 


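Theorem 4.25. Let $F : \mathcal{C} \to \mathcal{C}'$ be an additive functor between two good categories. Suppose that $F$ either is covariant right exact or is contravariant left exact, and suppose that $\mathcal{C}$ has enough projectives. Then the passage as in Theorem 4.22 from short exact sequences in $\mathcal{C}$ to long exact sequences of derived functor modules in $\mathcal{C}'$ is functorial in the following sense: whenever

$$\begin{array}{ccccccccc}
0 & \longrightarrow & A & \xrightarrow{\ \varphi\ } & B & \xrightarrow{\ \psi\ } & C & \longrightarrow & 0 \\
 & & {\scriptstyle f_A}\big\uparrow & & {\scriptstyle f_B}\big\uparrow & & {\scriptstyle f_C}\big\uparrow & & \\
0 & \longrightarrow & \widetilde{A} & \xrightarrow{\ \widetilde{\varphi}\ } & \widetilde{B} & \xrightarrow{\ \widetilde{\psi}\ } & \widetilde{C} & \longrightarrow & 0
\end{array}$$

is a diagram in $\mathcal{C}$ with exact rows and commuting squares, then the long exact sequences of derived functors of $F$ on $A, B, C$ and $\widetilde{A}, \widetilde{B}, \widetilde{C}$ make commutative squares with the maps induced by the derived functors on $f_A$, $f_B$, $f_C$.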


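PROOF. The proof of Theorem 4.22 involved constructing a diagram

$$\begin{array}{ccccccccccc}
 & & 0 & & 0 & & 0 & & 0 & & \\
 & & \downarrow & & \downarrow & & \downarrow & & \downarrow & & \\
0 & \longleftarrow & A & \xleftarrow{\ \varepsilon_A\ } & X_0 & \xleftarrow{\ \alpha_0\ } & X_1 & \xleftarrow{\ \alpha_1\ } & X_2 & \longleftarrow & \cdots \\
 & & {\scriptstyle\varphi}\downarrow & & {\scriptstyle i_{X_0}}\downarrow & & {\scriptstyle i_{X_1}}\downarrow & & {\scriptstyle i_{X_2}}\downarrow & & \\
0 & \longleftarrow & B & \xleftarrow{\ \varepsilon_B\ } & X_0 \oplus Z_0 & \xleftarrow{\ \beta_0\ } & X_1 \oplus Z_1 & \xleftarrow{\ \beta_1\ } & X_2 \oplus Z_2 & \longleftarrow & \cdots \\
 & & {\scriptstyle\psi}\downarrow & & {\scriptstyle p_{Z_0}}\downarrow & & {\scriptstyle p_{Z_1}}\downarrow & & {\scriptstyle p_{Z_2}}\downarrow & & \\
0 & \longleftarrow & C & \xleftarrow{\ \varepsilon_C\ } & Z_0 & \xleftarrow{\ \gamma_0\ } & Z_1 & \xleftarrow{\ \gamma_1\ } & Z_2 & \longleftarrow & \cdots \\
 & & \downarrow & & \downarrow & & \downarrow & & \downarrow & & \\
 & & 0 & & 0 & & 0 & & 0 & &
\end{array}$$

with exact rows and commuting squares in which each $X_n$ and $Z_n$ is projective, and a similar diagram corresponds to the given short exact sequence with tildes on it. The present theorem will follow from the functoriality in Theorem 4.10 if we can arrange that these two diagrams can be embedded in a 3-dimensional diagram with each of these diagrams in a horizontal plane and with vertical maps from the one diagram to the other such that all vertical squares commute.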

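We are given vertical maps $f_A$, $f_B$, and $f_C$, which we can regard as extending from the diagram with tildes to the other diagram. In addition, Theorem 4.12 gives us chain maps $\{f_{X_n}\}$ and $\{f_{Z_n}\}$ with $f_{X_{-1}} = f_A$ and $f_{Z_{-1}} = f_C$, and all the completed vertical squares in the 3-dimensional diagram commute. To complete the proof, we construct by induction for $n \ge 0$ a map $f_n : \widetilde{X}_n \oplus \widetilde{Z}_n \to X_n \oplus Z_n$ such that

$$p_{Z_n} f_n = f_{Z_n}\widetilde{p}_{Z_n}, \qquad f_n\widetilde{i}_{X_n} = i_{X_n} f_{X_n}, \qquad \beta_{n-1} f_n = f_{n-1}\widetilde{\beta}_{n-1} \tag{$*$}$$

with the understanding that $\beta_{-1} = \varepsilon_B$. To make it possible for the inductive step to include the starting step of the induction, let us write $X_{-1} = A$, $Z_{-1} = C$, $i_{X_{-1}} = \varphi$, $p_{Z_{-1}} = \psi$, $\alpha_{-1} = \varepsilon_A$, $\gamma_{-1} = \varepsilon_C$, and $f_{-1} = f_B$, with the corresponding conventions for the diagram with tildes. Also, let us understand any module or map with subscript $-2$ to be 0.

We shall construct $f_n$. For $\widetilde{z} \in \widetilde{Z}_n$, we apply $p_{Z_{n-1}}$ to the difference $\beta_{n-1}(0, f_{Z_n}\widetilde{z}) - f_{n-1}\widetilde{\beta}_{n-1}(0, \widetilde{z})$ and get

$$\begin{aligned}
p_{Z_{n-1}}\beta_{n-1}(0, f_{Z_n}\widetilde{z}) - p_{Z_{n-1}} f_{n-1}\widetilde{\beta}_{n-1}(0, \widetilde{z})
&= \gamma_{n-1} p_{Z_n}(0, f_{Z_n}\widetilde{z}) - f_{Z_{n-1}}\widetilde{p}_{Z_{n-1}}\widetilde{\beta}_{n-1}(0, \widetilde{z}) \\
&= \gamma_{n-1} f_{Z_n}\widetilde{z} - f_{Z_{n-1}}\widetilde{\gamma}_{n-1}\widetilde{p}_{Z_n}(0, \widetilde{z}) \\
&= \gamma_{n-1} f_{Z_n}\widetilde{z} - f_{Z_{n-1}}\widetilde{\gamma}_{n-1}\widetilde{z} = 0.
\end{aligned}$$

Thus $\beta_{n-1}(0, f_{Z_n}\widetilde{z}) - f_{n-1}\widetilde{\beta}_{n-1}(0, \widetilde{z}) = i_{X_{n-1}}(x)$ for a unique $x \in X_{n-1}$, and we define $\tau : \widetilde{Z}_n \to X_{n-1}$ by saying that $\tau(\widetilde{z})$ should be this $x$. This makes

$$i_{X_{n-1}}\tau(\widetilde{z}) = \beta_{n-1}(0, f_{Z_n}\widetilde{z}) - f_{n-1}\widetilde{\beta}_{n-1}(0, \widetilde{z}).$$

Setting up the diagram

$$\begin{array}{ccccc}
 & & \widetilde{Z}_n & & \\
 & & {\scriptstyle\tau}\big\downarrow & \searrow{\scriptstyle\sigma} & \\
X_{n-2} & \xleftarrow{\ \alpha_{n-2}\ } & X_{n-1} & \xleftarrow{\ \alpha_{n-1}\ } & X_n
\end{array}$$

we prepare to invoke Lemma 4.11. We have

$$\begin{aligned}
i_{X_{n-2}}\alpha_{n-2}\tau(\widetilde{z}) = \beta_{n-2} i_{X_{n-1}}\tau(\widetilde{z})
&= \beta_{n-2}\beta_{n-1}(0, f_{Z_n}\widetilde{z}) - \beta_{n-2} f_{n-1}\widetilde{\beta}_{n-1}(0, \widetilde{z}) \\
&= 0 - f_{n-2}\widetilde{\beta}_{n-2}\widetilde{\beta}_{n-1}(0, \widetilde{z}) = 0.
\end{aligned}$$

Since $i_{X_{n-2}}$ is one-one, $\alpha_{n-2}\tau = 0$, and Lemma 4.11 applies. Thus we obtain $\sigma : \widetilde{Z}_n \to X_n$ with $\alpha_{n-1}\sigma = \tau$, and $\sigma$ satisfies

$$i_{X_{n-1}}\alpha_{n-1}\sigma(\widetilde{z}) = \beta_{n-1}(0, f_{Z_n}\widetilde{z}) - f_{n-1}\widetilde{\beta}_{n-1}(0, \widetilde{z}). \tag{$**$}$$

Define

$$f_n(\widetilde{x}, \widetilde{z}) = \bigl(f_{X_n}(\widetilde{x}) - \sigma(\widetilde{z}),\ f_{Z_n}(\widetilde{z})\bigr). \tag{$\dagger$}$$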


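With $f_n$ defined, we are to prove the three formulas ($*$). For the first formula in ($*$), we apply $p_{Z_n}$ to both sides of ($\dagger$) and obtain $p_{Z_n} f_n(\widetilde{x}, \widetilde{z}) = f_{Z_n}\widetilde{z} = f_{Z_n}\widetilde{p}_{Z_n}(\widetilde{x}, \widetilde{z})$, which is the desired formula. The second formula in ($*$) at $\widetilde{x}$ is just ($\dagger$) with $\widetilde{z} = 0$.

We are left with proving the third formula in ($*$). Using the second formula in ($*$), we have

$$\begin{aligned}
\beta_{n-1} f_n(\widetilde{x}, 0) = \beta_{n-1} f_n\widetilde{i}_{X_n}(\widetilde{x}) &= \beta_{n-1} i_{X_n} f_{X_n}(\widetilde{x}) \\
&= i_{X_{n-1}}\alpha_{n-1} f_{X_n}(\widetilde{x}) = i_{X_{n-1}} f_{X_{n-1}}\widetilde{\alpha}_{n-1}(\widetilde{x}) \\
&= f_{n-1}\widetilde{i}_{X_{n-1}}\widetilde{\alpha}_{n-1}(\widetilde{x}) = f_{n-1}\widetilde{\beta}_{n-1}\widetilde{i}_{X_n}(\widetilde{x}) \\
&= f_{n-1}\widetilde{\beta}_{n-1}(\widetilde{x}, 0).
\end{aligned} \tag{$\dagger\dagger$}$$

Also,

$$\begin{aligned}
\beta_{n-1} f_n(0, \widetilde{z}) &= -\beta_{n-1} i_{X_n}\sigma(\widetilde{z}) + \beta_{n-1}(0, f_{Z_n}(\widetilde{z})) && \text{by } (\dagger) \\
&= -i_{X_{n-1}}\alpha_{n-1}\sigma(\widetilde{z}) + \beta_{n-1}(0, f_{Z_n}(\widetilde{z})) && \text{by commutativity} \\
&= f_{n-1}\widetilde{\beta}_{n-1}(0, \widetilde{z}) && \text{by } (**).
\end{aligned}$$

Adding this equality and ($\dagger\dagger$), we obtain the third formula of ($*$). This completes the proof.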


The version of Theorem 4.25 appropriate for Theorem 4.24 is the following, 
and its proof is similar. 


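Theorem 4.26. Let $F : \mathcal{C} \to \mathcal{C}'$ be an additive functor between two good categories. Suppose that $F$ either is contravariant right exact or is covariant left exact, and suppose that $\mathcal{C}$ has enough injectives. Then the passage as in Theorem 4.24 from short exact sequences in $\mathcal{C}$ to long exact sequences of derived functor modules in $\mathcal{C}'$ is functorial in the following sense: whenever

$$\begin{array}{ccccccccc}
0 & \longrightarrow & A & \xrightarrow{\ \varphi\ } & B & \xrightarrow{\ \psi\ } & C & \longrightarrow & 0 \\
 & & {\scriptstyle f_A}\big\uparrow & & {\scriptstyle f_B}\big\uparrow & & {\scriptstyle f_C}\big\uparrow & & \\
0 & \longrightarrow & \widetilde{A} & \xrightarrow{\ \widetilde{\varphi}\ } & \widetilde{B} & \xrightarrow{\ \widetilde{\psi}\ } & \widetilde{C} & \longrightarrow & 0
\end{array}$$

is a diagram in $\mathcal{C}$ with exact rows and commuting squares, then the long exact sequences of derived functors of $F$ on $A, B, C$ and $\widetilde{A}, \widetilde{B}, \widetilde{C}$ make commutative squares with the maps induced by the derived functors on $f_A$, $f_B$, $f_C$.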


We come to an important application of the long exact sequences in Theorems 
4.22 and 4.24. Projective and injective resolutions make it easy to work with de- 
rived functors theoretically, but in practice any computations with them are likely 
to be difficult. It is therefore convenient to be able to compute derived functors 
from other resolutions than projective and injective ones.¹¹ For definiteness let us work with the case of a covariant left exact functor in a good category with enough injectives; this is the most important case in applications, and the other three cases in Figure 4.5 can be handled in similar fashion. Let $F : \mathcal{C} \to \mathcal{C}'$ be an additive functor between good categories that is covariant left exact. A module $M$ in $\mathcal{C}$ is said to be $F$-acyclic if $F^n(M) = 0$ for all $n \ge 1$. Every module $M$ that is injective in $\mathcal{C}$ is $F$-acyclic, since $0 \to M \to M \to 0$ is an injective resolution of $M$ from which we can see that $F^n(M) = 0$ for $n \ge 1$. An $F$-acyclic resolution of a module $A$ in $\mathcal{C}$ is a resolution $X = (A \to X^+)$ in which $X_n$ is an $F$-acyclic module for all $n \ge 0$.

¹¹ The case of sheaf cohomology illustrates this point well. The present theory extends from good categories of modules to arbitrary abelian categories along the lines of Section 8 below, and the cohomology theory of sheaves fits into this more general framework. One additive functor of interest with sheaves is the “global-sections” functor. Its derived functors can be formed with injective resolutions, built from “flabby” sheaves, but flabby sheaves as a practical matter are too big to be useful in computations. In the theory of several complex variables for example, one approach is to substitute “fine” sheaves in resolutions; these permit computations and fall under the abelian-category generalization of Theorem 4.27 below.


Theorem 4.27. Let $\mathcal{C}$ and $\mathcal{C}'$ be two good categories, let $F$ be an additive functor from $\mathcal{C}$ to $\mathcal{C}'$ that is covariant and left exact, and suppose that $\mathcal{C}$ has enough injectives. If a module $A$ in $\mathcal{C}$ has an $F$-acyclic resolution $X = (A \to X^+)$ and if $I = (A \to I^+)$ is any injective resolution of $A$, then any cochain map $f : X \to I$ with $f_{-1} = 1_A$ induces an isomorphism $F^n(A) \cong H^n(F(X))$ for each $n \ge 0$.


REMARKS. Such a cochain map always exists and is unique up to homotopy, according to Theorem 4.16. Theorem 4.27 says that the derived functors of $F$ on any module $A$ can be computed from any $F$-acyclic resolution of $A$; it is not necessary to work only with injective resolutions. The same result as in the theorem holds with $F_n(A) \cong H_n(F(X))$ if $F$ is contravariant and right exact. If $F$ is covariant right exact or contravariant left exact and if $\mathcal{C}$ has enough projectives, then any chain map from a projective resolution of $A$ to an $F$-acyclic resolution¹² induces an isomorphism of the derived functors of $A$ with the homology or cohomology of $F$ of the $F$-acyclic resolution.


PROOF. The injective resolution is at our disposal, according to Corollary 4.17. Using the hypothesis that $\mathcal{C}$ has enough injectives, choose for each $n$ an injective $J_n$ containing $X_n$, let $g_n : X_n \to J_n$ be the inclusion, and make $\{J_n\}$ into an injective resolution of 0 with coboundary maps 0. Then replace $I$ in the assumptions by $I \oplus J$ and $f$ by $(f, g)$. The result is that we have reduced the theorem to the case that $f$ is one-one. Changing notation, we may assume from the outset that the injective resolution is $I = (A \to I^+)$ and that the cochain map $f : X \to I$ is one-one in each degree.

Put $Y_n = I_n/f_n(X_n) = \operatorname{coker} f_n$. The sequence

$$0 \longrightarrow X_n \xrightarrow{\ f_n\ } I_n \longrightarrow Y_n \longrightarrow 0 \tag{$*$}$$

is exact, and Theorem 4.24b shows that the sequence

$$F^k(I_n) \longrightarrow F^k(Y_n) \longrightarrow F^{k+1}(X_n)$$


is exact for every $k \ge 0$. Since $I_n$ and $X_n$ are $F$-acyclic for $n \ge 0$, the end terms are 0 for all $k \ge 1$. Consequently $Y_n$ is $F$-acyclic for all $n \ge 0$.

Referring to ($*$) for $n$ and for $n + 1$, we see that the coboundary map from $I_n$ to $I_{n+1}$ induces a compatible coboundary map from $Y_n$ to $Y_{n+1}$. Thus we may consider $Y = (0 \to Y^+)$ as a cochain complex with $Y^+ = \{Y_n\}_{n \ge 0}$. Then the equations ($*$) for all $n \ge 0$, together with the coboundary maps, make

$$0 \longrightarrow X \xrightarrow{\ f\ } I \longrightarrow Y \longrightarrow 0 \tag{$**$}$$

into a short exact sequence of complexes. Since $X$ and $I$ are exact, Corollary 4.8 shows that $Y$ is exact.

¹² For this situation, $F$-acyclic resolutions are understood to be chain complexes rather than cochain complexes.

If we apply $F$ to the short exact sequence of complexes ($**$), we obtain a planar diagram

$$0 \longrightarrow F(X) \xrightarrow{\ F(f)\ } F(I) \longrightarrow F(Y) \longrightarrow 0 \tag{$\dagger$}$$

whose rows are the result of applying $F$ to ($*$), whose columns are complexes, and whose squares commute. As usual we drop the row for $n = -1$, replacing it with a row of 0's. Let us prove that ($\dagger$) is in fact a short exact sequence of complexes. In fact, the result of applying $F$ to ($*$) is the long exact sequence that begins

$$0 \longrightarrow F(X_n) \longrightarrow F(I_n) \longrightarrow F(Y_n) \longrightarrow F^1(X_n).$$

For $n \ge 0$, $X_n$ is $F$-acyclic. Thus $F^1(X_n) = 0$, and the exactness for $n \ge 0$ follows. For $n \le -1$, the rows of the diagram ($\dagger$) are 0 and hence are exact. Thus ($\dagger$) is a short exact sequence of complexes.

We shall now prove that $F(Y) = (0 \to F(Y^+))$ is exact. Combining this fact with the exactness of the rows of ($\dagger$) and applying Corollary 4.8 will then yield $H^n(F(X)) \cong H^n(F(I))$ for all $n \ge 0$. Since $H^n(F(I)) = F^n(A)$, this step will complete the proof.

To prove that $F(Y) = (0 \to F(Y^+))$ is exact, define $Z_0 = Y_0$ and $Z_n = \operatorname{coker}(Y_{n-1} \to Y_n)$ for $n \ge 1$. Let $d_n : Y_n \to Y_{n+1}$ be the coboundary map. For each $n \ge 0$, the complex

$$0 \longrightarrow Y_n/\ker d_n \longrightarrow Y_{n+1} \longrightarrow Z_{n+1} \longrightarrow 0$$

is exact. Since $\ker d_n = \operatorname{image} d_{n-1}$ by exactness of $Y$, we have $Y_n/\ker d_n = Y_n/\operatorname{image} d_{n-1} = Z_n$, and thus

$$0 \longrightarrow Z_n \longrightarrow Y_{n+1} \longrightarrow Z_{n+1} \longrightarrow 0 \tag{$\dagger\dagger$}$$

is exact for all $n \ge 0$.

Let us use ($\dagger\dagger$) to prove the preliminary result that $Z_n$ is $F$-acyclic for all $n \ge 0$. For $n = 0$, $Z_0 = Y_0$, and $Y_0$ is known to be $F$-acyclic. Proceeding inductively, suppose that $Z_n$ is known to be $F$-acyclic. Applying Theorem 4.24b to ($\dagger\dagger$), we see that

$$F^k(Y_{n+1}) \longrightarrow F^k(Z_{n+1}) \longrightarrow F^{k+1}(Z_n)$$

is exact for all $n \ge 0$ and all $k \ge 0$. For $n \ge 0$ and $k \ge 1$, the left end is 0 because $Y_{n+1}$ is $F$-acyclic, and the right end is 0 because $Z_n$ is $F$-acyclic by the inductive hypothesis. Therefore the middle term is 0, $Z_{n+1}$ is $F$-acyclic, and the induction is complete.

Theorem 4.24b when applied to ($\dagger\dagger$) shows that

$$0 \longrightarrow F(Z_n) \longrightarrow F(Y_{n+1}) \longrightarrow F(Z_{n+1}) \longrightarrow F^1(Z_n)$$

is exact for all $n \ge 0$, and we now know that the term at the right end is 0. Therefore

$$0 \longrightarrow F(Z_n) \longrightarrow F(Y_{n+1}) \longrightarrow F(Z_{n+1}) \longrightarrow 0 \tag{$\ddagger$}$$

is exact for all $n \ge 0$.
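Now we can prove that the complex

$$0 \longrightarrow F(Y_0) \longrightarrow F(Y_1) \longrightarrow F(Y_2) \longrightarrow \cdots \tag{$\ddagger\ddagger$}$$

is exact at each module $F(Y_n)$. We know from Section 2 that we can merge two exact sequences

$$\cdots \to F(Y_{n+1}) \to F(Z_{n+1}) \to 0 \qquad \text{and} \qquad 0 \to F(Z_{n+1}) \to F(Y_{n+2}) \to \cdots$$

into a single exact sequence

$$\cdots \longrightarrow F(Y_{n+1}) \longrightarrow F(Y_{n+2}) \longrightarrow \cdots.$$

Consequently inductive application of ($\ddagger$) shows that the sequence

$$0 \to F(Z_0) \to F(Y_1) \to F(Y_2) \to \cdots \to F(Y_{n+1}) \to F(Z_{n+1}) \to 0$$

is exact for each $n \ge 0$. In addition, we know that $Z_0 = Y_0$ by definition. Therefore ($\ddagger\ddagger$) is exact at $F(Y_n)$ for each $n \ge 0$, and the proof is complete.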


Theorems 4.22 and 4.24 produce a long exact sequence from one additive 
functor and a short exact sequence of modules. Although it may at first seem odd 
to do so, we can obtain a different long exact sequence by varying the functor 
and fixing the module. This result, given as Proposition 4.28 below, will be used 
in the next section in analyzing the Ext and Tor functors.

Let $\mathcal{C}$ and $\mathcal{C}'$ be two good categories, and let $F, G, H$ be three additive functors from $\mathcal{C}$ to $\mathcal{C}'$. For definiteness, suppose that $F, G, H$ are covariant and right exact. Suppose that there is a natural transformation $S$ of $F$ into $G$ and there is a natural transformation $T$ of $G$ into $H$. We say that the sequence

$$F \xrightarrow{\ S\ } G \xrightarrow{\ T\ } H$$

is exact on projectives if for every projective $P$ in $\mathcal{C}$, the sequence

$$0 \longrightarrow F(P) \xrightarrow{\ S_P\ } G(P) \xrightarrow{\ T_P\ } H(P) \longrightarrow 0$$

is exact. Analogous definitions are to be made with projectives or injectives for the three other kinds of derived functors as in Figure 4.5.


Proposition 4.28. Let $\mathcal{C}$ and $\mathcal{C}'$ be two good categories, let $F, G, H$ be three additive functors from $\mathcal{C}$ to $\mathcal{C}'$, suppose that $F, G, H$ are covariant and right exact, and suppose that $\mathcal{C}$ has enough projectives. If there are natural transformations $S : F \to G$ and $T : G \to H$ such that the sequence $F \xrightarrow{S} G \xrightarrow{T} H$ is exact on projectives, then the derived functors of $F, G, H$ on each module $A$ in $\mathcal{C}$ form a long exact sequence

$$0 \leftarrow H(A) \leftarrow G(A) \leftarrow F(A) \leftarrow H_1(A) \leftarrow G_1(A) \leftarrow F_1(A) \leftarrow H_2(A) \leftarrow G_2(A) \leftarrow F_2(A) \leftarrow H_3(A) \leftarrow \cdots.$$

The passage from $A$ to the long exact sequence is functorial in $A$.


REMARKS. The same long exact sequence and functoriality hold with the arrows reversed and $F$ and $H$ interchanged if the three functors are contravariant and left exact. If $F, G, H$ are contravariant and right exact or are covariant and left exact, then analogous conclusions are valid provided $\mathcal{C}$ has enough injectives and the natural transformations $S$ and $T$ are exact on injectives.
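
To see what the hypotheses of Proposition 4.28 look like in practice, here is a sketch of one instance. It is an illustration under stated assumptions rather than part of the proposition: we take $\mathcal{C}$ to be the category of all unital left $R$ modules, and $0 \to N' \to N \to N'' \to 0$ is a hypothetical short exact sequence of unital right $R$ modules.

```latex
% Illustrative assumptions: 0 -> N' -> N -> N'' -> 0 is exact in right R modules,
% and the three functors on left R modules are
\[
  F = N' \otimes_R (\,\cdot\,), \qquad
  G = N \otimes_R (\,\cdot\,), \qquad
  H = N'' \otimes_R (\,\cdot\,),
\]
% with S and T induced by the maps N' -> N and N -> N''.
% Exactness on projectives: for a free module P = \bigoplus_{j} R we have
\[
  0 \to N' \otimes_R P \to N \otimes_R P \to N'' \otimes_R P \to 0
  \;\;\cong\;\;
  \bigoplus_{j} \bigl( 0 \to N' \to N \to N'' \to 0 \bigr),
\]
% which is exact; a projective is a direct summand of a free module, and a
% direct summand of an exact sequence is exact.
% Conclusion of Proposition 4.28 for each left R module A:
\[
  0 \leftarrow N'' \otimes_R A \leftarrow N \otimes_R A \leftarrow N' \otimes_R A
    \leftarrow H_1(A) \leftarrow G_1(A) \leftarrow F_1(A)
    \leftarrow H_2(A) \leftarrow \cdots .
\]
```

The point of the example is that the three functors are only right exact in general, yet the sequence relating them becomes exact, including the 0 on the left, once it is evaluated on a projective.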


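PROOF. If $P = (P^+ \to A)$ is a projective resolution of $A$, then the natural transformations $S$ and $T$ give us a planar diagram

$$\begin{array}{ccccccccc}
 & & 0 & & 0 & & 0 & & \\
 & & \uparrow & & \uparrow & & \uparrow & & \\
0 & \longrightarrow & F(P_0) & \xrightarrow{\ S_{P_0}\ } & G(P_0) & \xrightarrow{\ T_{P_0}\ } & H(P_0) & \longrightarrow & 0 \\
 & & \uparrow & & \uparrow & & \uparrow & & \\
0 & \longrightarrow & F(P_1) & \xrightarrow{\ S_{P_1}\ } & G(P_1) & \xrightarrow{\ T_{P_1}\ } & H(P_1) & \longrightarrow & 0 \\
 & & \uparrow & & \uparrow & & \uparrow & & \\
 & & \vdots & & \vdots & & \vdots & &
\end{array}$$

in which the columns are complexes, the rows are exact because the sequence $F \xrightarrow{S} G \xrightarrow{T} H$ is exact on projectives, and the squares commute because $S$ and $T$ are natural transformations. The construction of the long exact sequence then follows from Theorem 4.7.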

For the functoriality, suppose that $\varphi : A \to A'$ is a map between two modules of $\mathcal{C}$. Let $P = (P^+ \to A)$ and $P' = (P'^+ \to A')$ be projective resolutions of $A$ and $A'$, and use Theorem 4.12 to extend $\varphi$ to a chain map $\{\varphi_n\}$ of $P$ to $P'$. Then the planar diagrams as above for $P$ and $P'$ can be embedded in a 3-dimensional diagram in such a way that the various maps $F(\varphi_n)$, $G(\varphi_n)$, and $H(\varphi_n)$ connecting the diagram for $P$ to the diagram for $P'$ make all squares commute. The functoriality now follows immediately from Theorem 4.10.


7. Ext and Tor


In this section we study the derived functors of Hom and tensor product. Although we shall treat each as carrying unital left $R$ modules, where $R$ is a ring with identity, to abelian groups, the theory applies also to more complicated versions of Hom and tensor product, such as when one of the $R$ modules in question is actually a bimodule for the rings $R$ and $S$ and the result of Hom or tensor product is an $S$ module. Problems 9-11 at the end of the chapter address the situation with bimodules.

We know that $\operatorname{Hom}_R(A, B)$ is a contravariant left exact functor of the $A$ variable and a covariant left exact functor of the $B$ variable. Thus we have two initial choices for inserting resolutions and creating derived functors, namely

$$\operatorname{Ext}^n_R(A, B) = H^n(\operatorname{Hom}_R(P, B)), \qquad \text{with } P = (A \leftarrow P^+) \text{ projective},$$

and

$$\operatorname{ext}^n_R(A, B) = H^n(\operatorname{Hom}_R(A, I)), \qquad \text{with } I = (B \to I^+) \text{ injective}.$$

Existence of the first one depends on having enough projectives in the category of the $A$ variable, and existence of the second one depends on having enough injectives in the category of the $B$ variable. Each of these, just as with Hom, depends on two variables, one in contravariant fashion and the other in covariant fashion. Thus Ext and ext are not functors of two variables in the strict sense of our definitions. Instead, they are examples of “bifunctors,” of which $\operatorname{Hom}_R(\,\cdot\,, \,\cdot\,)$ is the prototype, and the main result, Theorem 4.31 below, in essence says that Ext and ext are naturally isomorphic as bifunctors, provided the first domain category has enough projectives and the second has enough injectives. Among other things this natural isomorphism will justify and explain how we were able to define cohomology of groups in more than one way.¹³
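
The agreement of the two constructions can be checked by hand in a small case. The computation below is a sketch for illustration only; the choices $R = \mathbb{Z}$, $A = \mathbb{Z}/n\mathbb{Z}$ with $n \ge 1$, and $B = \mathbb{Z}$ are assumptions, and the injectivity of $\mathbb{Q}$ and $\mathbb{Q}/\mathbb{Z}$ uses the fact from Section 5 that divisible abelian groups are injective in $\mathcal{C}_{\mathbb{Z}}$.

```latex
% Illustrative assumptions: R = Z, A = Z/nZ, B = Z.
% (1) Ext via a projective resolution of A:  0 <- Z/nZ <- Z <-(n)- Z <- 0.
%     Apply Hom_Z( . , Z) and drop the term for A:
\[
  0 \longrightarrow \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}, \mathbb{Z})
    \xrightarrow{\ n\ } \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}, \mathbb{Z})
    \longrightarrow 0
  \qquad\text{i.e.}\qquad
  0 \longrightarrow \mathbb{Z} \xrightarrow{\ n\ } \mathbb{Z} \longrightarrow 0,
\]
\[
  \operatorname{Ext}^0_{\mathbb{Z}}(\mathbb{Z}/n\mathbb{Z}, \mathbb{Z}) = \ker(n) = 0,
  \qquad
  \operatorname{Ext}^1_{\mathbb{Z}}(\mathbb{Z}/n\mathbb{Z}, \mathbb{Z})
    = \operatorname{coker}(n) = \mathbb{Z}/n\mathbb{Z}.
\]
% (2) ext via an injective resolution of B:  0 -> Z -> Q -> Q/Z -> 0.
%     Apply Hom_Z(Z/nZ, . ) and drop the term for B:
\[
  0 \longrightarrow \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/n\mathbb{Z}, \mathbb{Q})
    \longrightarrow \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/n\mathbb{Z}, \mathbb{Q}/\mathbb{Z})
    \longrightarrow 0
  \qquad\text{i.e.}\qquad
  0 \longrightarrow 0 \longrightarrow \tfrac{1}{n}\mathbb{Z}/\mathbb{Z} \longrightarrow 0,
\]
\[
  \operatorname{ext}^0_{\mathbb{Z}}(\mathbb{Z}/n\mathbb{Z}, \mathbb{Z}) = 0,
  \qquad
  \operatorname{ext}^1_{\mathbb{Z}}(\mathbb{Z}/n\mathbb{Z}, \mathbb{Z})
    = \tfrac{1}{n}\mathbb{Z}/\mathbb{Z} \cong \mathbb{Z}/n\mathbb{Z}.
\]
```

Both routes give 0 in degree 0, which is $\operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}/n\mathbb{Z}, \mathbb{Z})$, and a cyclic group of order $n$ in degree 1, in line with the isomorphism of Ext and ext asserted in general by Theorem 4.31.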

In the case of tensor product $A \otimes_R B$, similar remarks apply. Here $A$ is a unital right $R$ module, and $B$ is a unital left $R$ module. The module $A$ in a natural way is a unital left $R^o$ module, where $R^o$ is the opposite ring of $R$, and thus tensor product is to be regarded as defined on the product of two categories of left modules just as Hom is. We can regard tensor product as an actual functor in either variable, and the functor is covariant right exact in both cases. Again we have two initial choices for inserting resolutions and creating derived functors, namely

$$\operatorname{Tor}^R_n(A, B) = H_n(P \otimes_R B), \qquad \text{with } P = (A \leftarrow P^+) \text{ projective},$$

and

$$\operatorname{tor}^R_n(A, B) = H_n(A \otimes_R P'), \qquad \text{with } P' = (B \leftarrow P'^+) \text{ projective}.$$

These exist if the domain categories have enough projectives. Both Tor and tor can be considered as covariant functors of two variables, or else as “bifunctors,” and one can show in the same way as for Ext and ext that Tor and tor are naturally isomorphic. There is no need to write out the details. It is customary to write Tor for the common value.


Proposition 4.29. Let $\mathcal{C}$ and $\mathcal{C}'$ be good categories of unital left $R$ modules, and suppose that $\mathcal{C}$ has enough projectives. Then the contravariant left exact functors $\operatorname{Hom}_R(\,\cdot\,, B)$ from $\mathcal{C}$ to $\mathcal{C}_{\mathbb{Z}}$ and their derived functors $\operatorname{Ext}^n_R(\,\cdot\,, B)$ have the following properties:

(a) Whenever $0 \to A' \to A \to A'' \to 0$ is a short exact sequence in $\mathcal{C}$, then there is a corresponding long exact sequence

$$\begin{aligned}
0 &\to \operatorname{Hom}_R(A'', B) \to \operatorname{Hom}_R(A, B) \to \operatorname{Hom}_R(A', B) \\
&\to \operatorname{Ext}^1_R(A'', B) \to \operatorname{Ext}^1_R(A, B) \to \operatorname{Ext}^1_R(A', B) \\
&\to \operatorname{Ext}^2_R(A'', B) \to \operatorname{Ext}^2_R(A, B) \to \operatorname{Ext}^2_R(A', B) \to \operatorname{Ext}^3_R(A'', B) \to \cdots
\end{aligned}$$

in $\mathcal{C}_{\mathbb{Z}}$ for each module $B$ in $\mathcal{C}'$. The passage from short exact sequences in $\mathcal{C}$ to long exact sequences of derived functor modules in $\mathcal{C}_{\mathbb{Z}}$ is functorial in its dependence on the exact sequence in the first variable in the sense of Theorem 4.25 and is natural in the second variable in the sense that if a map $\eta : B \to \widetilde{B}$ is given, then $\operatorname{Hom}(1, \eta)$ defines a chain map from the long exact sequence for $B$ to the long exact sequence for $\widetilde{B}$.

(b) If $P$ is a projective in $\mathcal{C}$ and $I$ is an injective in $\mathcal{C}'$, then $\operatorname{Ext}^n_R(P, B) = 0 = \operatorname{Ext}^n_R(A, I)$ for all $n \ge 1$ and all modules $A$ in $\mathcal{C}$ and $B$ in $\mathcal{C}'$.

(c) Whenever $0 \to B' \to B \to B'' \to 0$ is a short exact sequence in $\mathcal{C}'$, then there is a corresponding long exact sequence

$$\begin{aligned}
0 &\to \operatorname{Hom}_R(A, B') \to \operatorname{Hom}_R(A, B) \to \operatorname{Hom}_R(A, B'') \\
&\to \operatorname{Ext}^1_R(A, B') \to \operatorname{Ext}^1_R(A, B) \to \operatorname{Ext}^1_R(A, B'') \\
&\to \operatorname{Ext}^2_R(A, B') \to \operatorname{Ext}^2_R(A, B) \to \operatorname{Ext}^2_R(A, B'') \to \operatorname{Ext}^3_R(A, B') \to \cdots
\end{aligned}$$

in $\mathcal{C}_{\mathbb{Z}}$ for each module $A$ in $\mathcal{C}$. The passage from short exact sequences in $\mathcal{C}'$ to long exact sequences of derived functor modules in $\mathcal{C}_{\mathbb{Z}}$ is functorial in the exact sequence in the second variable and is natural in the first variable in the sense that if a map $\eta : A \to \widetilde{A}$ is given, then $\operatorname{Hom}(\eta, 1)$ defines a chain map from the long exact sequence for $\widetilde{A}$ to the long exact sequence for $A$.

¹³ It would add only definitions to our discussion to say precisely what a general bifunctor is and what a general natural transformation between bifunctors is, and we shall skip that detail, in effect incorporating the definitions into the theorem.


REMARKS. The naturality in the B parameter of the construction of the long
exact sequence in (a) implies that Ext^n_R is a covariant functor of the second variable
for fixed argument of the first variable. It implies also that all maps Ext^n_R(α, 1)
commute with all maps Ext^n_R(1, β).


PROOF. For (a), Theorem 4.22b gives the exact sequence, and Theorem 4.25
proves the functoriality in the first variable. For the naturality in the second
variable, let η : B → B̃ be given. The proof of Theorem 4.22 produces a
short exact sequence of projective resolutions of A', A, A'' to which the functor
in that theorem is then applied. We now have two such functors Hom_R( · , B)
and Hom_R( · , B̃), and the maps within each image diagram are all of the form
Hom(α, 1). The two diagrams fit into a 3-dimensional diagram, and the maps
between the two diagrams are of the form Hom(1, η). Since all maps Hom(α, 1)
commute with all maps Hom(1, β), the 3-dimensional diagram is commutative.
The corresponding long exact sequences are then related by a cochain map
according to Theorem 4.10.

For (b), 0 ← P ← P ← 0 is a projective resolution of P, and hence any
derived functor that is defined by projective resolutions is 0 in degree ≥ 1. In
addition, Proposition 4.19b shows that Hom_R( · , I) is an exact functor, and hence
its derived functors are 0 in degree ≥ 1.

For (c), we shall apply Proposition 4.28 in its version for contravariant left exact
functors. Let φ : B' → B and ψ : B → B'' be the maps in the given short exact
sequence, and let F, G, H be the functors with F(A) = Hom_R(A, B'), G(A) =
Hom_R(A, B), H(A) = Hom_R(A, B''). Then we have a natural transformation S
of F into G given by S_A = Hom(1, φ) and a natural transformation T of G into
H given by T_A = Hom(1, ψ). Since


    0 → Hom_R(P, B') → Hom_R(P, B) → Hom_R(P, B'') → 0


is exact by Proposition 4.19a, the sequence


    F --S--> G --T--> H


is exact on projectives. Proposition 4.28 in its version for contravariant left exact
functors then says that there is a long exact sequence


    0 → F(A) → G(A) → H(A) → F^1(A) → G^1(A) → H^1(A) → ···


and that the passage to this long exact sequence is functorial in A. This much
establishes the long exact sequence in (c) and the naturality in the A variable. For the
behavior in the second variable with A fixed, suppose that we have a second exact
sequence 0 → B̃' → B̃ → B̃'' → 0 that maps to the given one by a chain map f.
Let F', G', H' be the functors Hom_R( · , B̃'), Hom_R( · , B̃), Hom_R( · , B̃''). We
then get two horizontal planar diagrams of the kind in the proof of Proposition
4.28, one for F', G', H' and one for F, G, H. The maps within each of the
two diagrams are maps in the A variable. The two diagrams embed in a
3-dimensional diagram with vertical maps Hom_R(1, f), and the 3-dimensional
diagram is commutative because all maps Hom(α, 1) commute with all maps
Hom(1, β). Application of Theorem 4.10 then completes the proof of
functoriality in the exact sequence in the second variable.


Proposition 4.30. Let C and C' be good categories of unital left R modules,
and suppose that C' has enough injectives. Then the covariant left exact
functors Hom_R(A, · ) from C' to C_Z and their derived functors ext^n_R(A, · ) have the
following properties:


(a) Whenever 0 → A' → A → A'' → 0 is a short exact sequence in C, then
there is a corresponding long exact sequence


    0 → Hom_R(A'', B) → Hom_R(A, B) → Hom_R(A', B)
      → ext^1_R(A'', B) → ext^1_R(A, B) → ext^1_R(A', B)
      → ext^2_R(A'', B) → ext^2_R(A, B) → ext^2_R(A', B) → ext^3_R(A'', B) → ···


in C_Z for each module B in C'. The passage from short exact sequences in C to long
exact sequences of derived functor modules in C_Z is functorial in its dependence
on the exact sequence in the first variable and is natural in the second variable in
the sense that if a map η : B → B̃ is given, then Hom(1, η) defines a chain map
from the long exact sequence for B to the long exact sequence for B̃.

(b) If P is a projective in C and I is an injective in C', then ext^n_R(P, B) = 0 =
ext^n_R(A, I) for all n ≥ 1 and all modules A in C and B in C'.


(c) Whenever 0 → B' → B → B'' → 0 is a short exact sequence in C', then
there is a corresponding long exact sequence


    0 → Hom_R(A, B') → Hom_R(A, B) → Hom_R(A, B'')
      → ext^1_R(A, B') → ext^1_R(A, B) → ext^1_R(A, B'')
      → ext^2_R(A, B') → ext^2_R(A, B) → ext^2_R(A, B'') → ext^3_R(A, B') → ···


in C_Z for each module A in C. The passage from short exact sequences in C' to
long exact sequences of derived functor modules in C_Z is functorial in the exact
sequence in the second variable and is natural in the first variable in the sense that
if a map η : A → Ã is given, then Hom(η, 1) defines a chain map from the long
exact sequence for Ã to the long exact sequence for A.


REMARKS. The naturality in the A parameter of the construction of the long
exact sequence in (c) implies that ext^n_R is a contravariant functor of the first variable
for fixed argument of the second variable. It implies also that all maps ext^n_R(α, 1)
commute with all maps ext^n_R(1, β).


PROOF. The proof of (c) is a simple variant of the proof of Proposition 4.29a, 
the proof of (b) is a simple variant of the proof of Proposition 4.29b, and the proof 
of (a) is a simple variant of the proof of Proposition 4.29c. 


Propositions 4.29 and 4.30 show that Ext and ext, as functors of the first variable 
and as functors of the second variable, generate the same long exact sequences, 
the first under the assumption that C has enough projectives and the second under 
the assumption that C’ has enough injectives. Theorem 4.31 will show that Ext 
and ext may be treated as equal if both assumptions are satisfied. It is customary 
therefore to use Ext as the notation in both cases; thus Ext exists if either C has 
enough projectives or C’ has enough injectives. In both cases, Ext has a long 
exact sequence in the first variable and another long exact sequence in the second 
variable. 
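
To see this agreement in a concrete case, here is a minimal worked computation. The choices are illustrative assumptions of ours, not taken from the text: R = Z, both categories equal to the category of all abelian groups, A = Z/m, B = Z/n with m, n ≥ 1, and d = gcd(m, n). The first computation resolves A by projectives, and the second resolves B by injectives.

```latex
% Assumptions (illustrative): R = \mathbb{Z}, A = \mathbb{Z}/m, B = \mathbb{Z}/n, d = \gcd(m,n).
\begin{align*}
% Ext: projective resolution 0 <- Z/m <- Z <-m- Z <- 0; apply Hom_Z( . , Z/n),
% using Hom_Z(Z, Z/n) = Z/n.
&\operatorname{Ext}:\quad 0 \longrightarrow \mathbb{Z}/n \xrightarrow{\;m\;} \mathbb{Z}/n \longrightarrow 0, \\
&\qquad \operatorname{Ext}^0_{\mathbb{Z}}(A,B) = \{x \in \mathbb{Z}/n : mx = 0\} \cong \mathbb{Z}/d, \qquad
  \operatorname{Ext}^1_{\mathbb{Z}}(A,B) = (\mathbb{Z}/n)/m(\mathbb{Z}/n) \cong \mathbb{Z}/d. \\
% ext: injective resolution 0 -> Z/n -> Q/Z -n-> Q/Z -> 0; apply Hom_Z(Z/m, . ),
% using Hom_Z(Z/m, Q/Z) = Z/m.
&\operatorname{ext}:\quad 0 \longrightarrow \mathbb{Z}/m \xrightarrow{\;n\;} \mathbb{Z}/m \longrightarrow 0, \\
&\qquad \operatorname{ext}^0_{\mathbb{Z}}(A,B) = \{y \in \mathbb{Z}/m : ny = 0\} \cong \mathbb{Z}/d, \qquad
  \operatorname{ext}^1_{\mathbb{Z}}(A,B) = (\mathbb{Z}/m)/n(\mathbb{Z}/m) \cong \mathbb{Z}/d.
\end{align*}
```

Both resolutions have length 1, so all groups in degree ≥ 2 vanish in either computation, and the two answers agree degree by degree.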


Theorem 4.31. Let C and C' be good categories of unital left R modules,
and suppose that C has enough projectives and C' has enough injectives. Then
Ext^n_R( · , · ) and ext^n_R( · , · ) are naturally isomorphic from C × C' to C_Z in the
sense that for each n ≥ 0 and each pair of modules (A, B) in C × C', there exists
an isomorphism T_(n,A,B) in Hom_Z(Ext^n_R(A, B), ext^n_R(A, B)) such that if φ is in
Hom_R(A, A') and ψ is in Hom_R(B, B'), then the diagrams


                       T_(n,A,B)
    Ext^n_R(A, B)   ------------->   ext^n_R(A, B)
         ↑                                ↑
         | Ext^n(φ, 1)                    | ext^n(φ, 1)
         |                                |
    Ext^n_R(A', B)  ------------->   ext^n_R(A', B)
                       T_(n,A',B)


and


                       T_(n,A,B)
    Ext^n_R(A, B)   ------------->   ext^n_R(A, B)
         |                                |
         | Ext^n(1, ψ)                    | ext^n(1, ψ)
         ↓                                ↓
    Ext^n_R(A, B')  ------------->   ext^n_R(A, B')
                       T_(n,A,B')


commute.


REMARKS. The reader will be able to observe that a certain part of this proof 
amounts to showing that 3-dimensional diagrams in the shape of a cube having 
5 faces equal to commuting squares and having suitable hypotheses on the maps 
automatically have their sixth face equal to a commuting square. The hypotheses 
concerning the faces and the maps come from Propositions 4.29 and 4.30, as well 
as induction. We shall not try to abstract a general result of this kind, however. 


PROOF. We induct on n for n ≥ 0. Several steps are involved in the proof, and
we complete all of them for a particular n before going on to n + 1. The steps for
a particular n are


(i) to define T_(n,A,B) in the presence of an injective I and a one-one map
μ : B → I and to observe that T_(n,A,B) is an isomorphism,
(ii) to show that the same T_(n,A,B) results independently of the choice of I,
(iii) to prove the commutativity of the second diagram in the statement of the
theorem, and
(iv) to prove the commutativity of the first diagram in the statement of the
theorem.


The first base case of the induction is n = 0, for which we take T_(0,A,B) to be the
identity on Hom_R(A, B). Then (i) through (iv) are immediate.

The other base case of the induction is n = 1. Let (A, B) be given. An
injective I and a one-one map μ : B → I exist as in (i) because C' has enough
injectives. Then we have an exact sequence


    0 → B --μ--> I --ν--> C → 0                                   (∗)


in which C = I/μ(B) and ν is the quotient map. We know from Propositions
4.29b and 4.30b that Ext^1_R(A, I) = 0 = ext^1_R(A, I). Therefore Propositions
4.29c and 4.30c give us exact sequences


    Hom_R(A, I) --Hom(1,ν)--> Hom_R(A, C) --ω_(E,0)--> Ext^1_R(A, B) → 0


and


    Hom_R(A, I) --Hom(1,ν)--> Hom_R(A, C) --ω_(e,0)--> ext^1_R(A, B) → 0


in which ω_(E,0) and ω_(e,0) are suitable connecting homomorphisms. We define
T_(1,A,B) = ω_(e,0)(ω_(E,0))^{-1}. This definition is meaningful, since the exactness of the
two sequences gives


    (ω_(E,0))^{-1}(0) = ker ω_(E,0) = Hom(1, ν)(Hom_R(A, I)) = ker ω_(e,0);


by an analogous computation, ω_(E,0)(ω_(e,0))^{-1} is a well-defined function, and it is
evidently a two-sided inverse. Thus T_(1,A,B) is an isomorphism. This completes
step (i).

In order to be able to handle steps (ii) and (iii) without being repetitive, let a
map ψ : B → B' be given. For (ii), B' will be B, and ψ will be the identity on
B. For (iii), B' and ψ will be general. Given ψ and one-one maps μ : B → I
and μ' : B' → I', we can form the exact rows and the first column of the diagram


    0 →  B  --μ-->   I  --ν-->   C  → 0
         |           |           |
         ψ           f           f̄                               (∗∗)
         ↓           ↓           ↓
    0 →  B' --μ'-->  I' --ν'-->  C' → 0


If we think of I and I' as extended to injective resolutions, Theorem 4.16 allows
us to fill in a cochain map from the one extension to the other, and the first new
step of that cochain map is f. If we define f̄ = ν' f ν^{-1}, then f̄ is well defined
because


    ν' f(ker ν) = ν' f(image μ) = ν' f μ(B) = ν' μ' ψ(B) ⊆ ν'(μ'(B')) = 0,


and the squares of the diagram (∗∗) now commute. Continuing with the effort
to cut down on repetitive arguments, let k ≥ 1 be an integer that will be 1 when
n = 1 and will be different later in the proof. Applying Proposition 4.29c to (∗∗)
gives us a commuting square


                           ω_(E,k-1)
    Ext^{k-1}_R(A, C)   ------------->   Ext^k_R(A, B)
         |                                    |
         | Ext^{k-1}(1, f̄)                   | Ext^k(1, ψ)       (†)
         ↓                                    ↓
    Ext^{k-1}_R(A, C')  ------------->   Ext^k_R(A, B')
                           ω'_(E,k-1)


for k ≥ 1, and Proposition 4.30c gives us a similar commuting square for ext for
k ≥ 1.

For each module in the diagram with Ext when k = 1, there is a map to the
corresponding module in the diagram with ext. These maps are T_(k-1,A,C) for
the upper left and T_(k-1,A,C') for the lower left. The maps for the upper right and
lower right depend on the step of the argument.

For step (ii), we are taking B' = B, and the maps at the right are the two
versions of T_(k,A,B), one for the injective I and one for the injective I'. Let
us call them T_(k,A,B) and T'_(k,A,B). We are to prove that T'_(k,A,B) Ext^k(1, ψ) =
ext^k(1, ψ) T_(k,A,B) for ψ = 1. The relevant definitions are


        T_(k,A,B) = ω_(e,k-1) T_(k-1,A,C) (ω_(E,k-1))^{-1}
    and T'_(k,A,B) = ω'_(e,k-1) T_(k-1,A,C') (ω'_(E,k-1))^{-1},

or equivalently

        T_(k,A,B) ω_(E,k-1) = ω_(e,k-1) T_(k-1,A,C)
    and T'_(k,A,B) ω'_(E,k-1) = ω'_(e,k-1) T_(k-1,A,C').

Since T_(k-1,A,C) and T_(k-1,A,C') are known inductively to be well defined and to
satisfy (iii), we have ext^{k-1}(1, f̄) T_(k-1,A,C) = T_(k-1,A,C') Ext^{k-1}(1, f̄). Thus

    ext^k(1, ψ) T_(k,A,B) ω_(E,k-1) = ext^k(1, ψ) ω_(e,k-1) T_(k-1,A,C)
      = ω'_(e,k-1) ext^{k-1}(1, f̄) T_(k-1,A,C) = ω'_(e,k-1) T_(k-1,A,C') Ext^{k-1}(1, f̄)
      = T'_(k,A,B) ω'_(E,k-1) Ext^{k-1}(1, f̄) = T'_(k,A,B) Ext^k(1, ψ) ω_(E,k-1).

Since Ext^k(1, ψ) = 1 and ext^k(1, ψ) = 1 when ψ = 1, step (ii) follows for
n = 1, i.e., T_(k,A,B) is well defined.

For step (iii), we are allowing general B', and the maps at the right between
the two versions of (†) are the well-defined isomorphisms T_(k,A,B) and T_(k,A,B').
We are to prove that T_(k,A,B') Ext^k(1, ψ) = ext^k(1, ψ) T_(k,A,B). The argument in
the previous paragraph applies if we change T'_(k,A,B) systematically to T_(k,A,B')
and take into account that ω_(E,k-1) is onto, and step (iii) follows for n = 1.


For step (iv), let φ : A → A' be given. The conclusion of Proposition 4.29c
that the dependence is natural in the first variable gives us a commuting square


                           ω_(E,k-1)
    Ext^{k-1}_R(A, C)   ------------->   Ext^k_R(A, B)
         ↑                                    ↑
         | Ext^{k-1}(φ, 1)                    | Ext^k(φ, 1)       (††)
         |                                    |
    Ext^{k-1}_R(A', C)  ------------->   Ext^k_R(A', B)
                           ω'_(E,k-1)


for k ≥ 1 and for suitable connecting homomorphisms ω_(E,k-1) and ω'_(E,k-1), and
Proposition 4.30c gives a similar commuting square for ext for k ≥ 1. For each
module in the diagram with Ext when k = 1, there is a map to the corresponding
module in the diagram with ext. These maps are T_(k-1,A,C) for the upper left,
T_(k-1,A',C) for the lower left, T_(k,A,B) for the upper right, and T_(k,A',B) for the lower
right. We are to prove that T_(k,A,B) Ext^k(φ, 1) = ext^k(φ, 1) T_(k,A',B). The relevant
definitions are


        T_(k,A,B) ω_(E,k-1) = ω_(e,k-1) T_(k-1,A,C)
    and T_(k,A',B) ω'_(E,k-1) = ω'_(e,k-1) T_(k-1,A',C).

Since T_(k-1,A,C) and T_(k-1,A',C) are known inductively to satisfy (iv), we have
ext^{k-1}(φ, 1) T_(k-1,A',C) = T_(k-1,A,C) Ext^{k-1}(φ, 1). Thus

    ext^k(φ, 1) T_(k,A',B) ω'_(E,k-1) = ext^k(φ, 1) ω'_(e,k-1) T_(k-1,A',C)
      = ω_(e,k-1) ext^{k-1}(φ, 1) T_(k-1,A',C) = ω_(e,k-1) T_(k-1,A,C) Ext^{k-1}(φ, 1)
      = T_(k,A,B) ω_(E,k-1) Ext^{k-1}(φ, 1) = T_(k,A,B) Ext^k(φ, 1) ω'_(E,k-1).

Since ω'_(E,k-1) is onto, step (iv) follows for n = 1. This completes the proof for
n = 1.
For the inductive step, suppose that steps (i) through (iv) have been carried out
for some n ≥ 1. Let us carry out step (i) for stage n + 1. For a given B, we know
from Propositions 4.29b and 4.30b that Ext^k_R(A, I) = 0 = ext^k_R(A, I) for every
k ≥ 1. Hence Propositions 4.29c and 4.30c give us exact sequences


    0 → Ext^n_R(A, C) --ω_(E,n)--> Ext^{n+1}_R(A, B) → 0

and

    0 → ext^n_R(A, C) --ω_(e,n)--> ext^{n+1}_R(A, B) → 0.


In other words, ω_(E,n) and ω_(e,n) are isomorphisms. If we put

    T_(n+1,A,B) = ω_(e,n) T_(n,A,C) (ω_(E,n))^{-1},


then T_(n+1,A,B) is an isomorphism of Ext^{n+1}_R(A, B) onto ext^{n+1}_R(A, B). This
completes step (i) for stage n + 1.

We now refer back to our argument for n = 1 and put k = n + 1 throughout. 
Tracing matters through, we see that the argument carries out steps (ii) through 
(iv) for stage n + 1. This completes the induction and the proof.


8. Abelian Categories


Not all situations in which one wants to apply homological algebra are limited to 
good categories of unital left R modules for some ring R. We have mentioned 
sheaves as one example, and we shall develop some properties of sheaves in Chap- 
ter X. Implicitly we have carried along a second example: all chain complexes 
within a good category, with chain maps as morphisms, form a category in which 
short exact sequences have remarkable properties, such as those in Theorems 4.7 
and 4.10. 

A setting to which one can generalize well such basic parts of homological 
algebra is that of “abelian categories,” which we define in this section. It is
advisable not to require that the objects in an abelian category actually be sets 
of individual elements; otherwise there is little chance that the notion of abelian 
category could be self dual. The morphisms of the category are then effectively 
all we have to work with, since a morphism already determines its “domain” and 
“range.” If X and Y are objects, then a morphism in Morph(X, Y) need not be a 
function, but at least Morph(X, Y) is a set with elements to it. Since objects no 
longer have elements, books usually suppress the objects in the discussion to the 
point of referring to things like kernels and cokernels as morphisms rather than 
objects. It is perhaps more comfortable to think of a kernel as a pair, consisting of 
an object and a morphism into another object, rather than just as the embedding 
morphism, and we shall follow the more comfortable convention temporarily. 

We introduce the notion of “abelian category” in stages. We begin with some 
definitions and remarks that make sense in a general category. First of all, let 
us have names for X and Y when referring to morphisms in Morph(X, Y) that 
do not require us to think in terms of functions. The convention is that if u is
in Morph(X, Y), then X is the domain of u and Y is the codomain. We allow 
ourselves to write compositions of morphisms as gf or as go f. 

Next, it is possible to generalize usefully the notions of “one-one” and “onto” to
make them applicable in any category. The definitions are in terms of cancellation
laws. In the category C, a morphism u ∈ Morph(X, Y) is a monomorphism¹⁴ if
for any f and g in the same set Morph(W, X) such that uf = ug, it follows that
f = g. Any isomorphism is certainly a monomorphism. The composition of two
monomorphisms is a monomorphism. In fact, if u and v are monomorphisms
with vuf = vug, then uf = ug because v is a monomorphism, and f = g
because u is a monomorphism. If m is a monomorphism in Morph(X, Y) and u
is any morphism in Morph(Y, X) such that mu = 1_Y, then m is an isomorphism.
In fact, mu = 1_Y implies mum = 1_Y m = m = m 1_X, which implies um = 1_X, since m
is a monomorphism; therefore u is a two-sided inverse to m.


¹⁴Some authors use the word “monic” or the word “mono” as an adjectival form of this noun.

The morphism u ∈ Morph(X, Y) is an epimorphism¹⁵ if for any f' and g'
in the same set Morph(Y, Z) such that f'u = g'u, it follows that f' = g'. Any
isomorphism is an epimorphism. The composition of two epimorphisms is an
epimorphism. If e is an epimorphism in Morph(X, Y) and u is any morphism in
Morph(Y, X) such that ue = 1_X, then e is an isomorphism.
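
The cancellation definitions do not always reduce to “one-one” and “onto” for functions. The following standard example is our own illustration, not taken from the text; it assumes the category of rings with identity, with identity-preserving homomorphisms as morphisms, and uses the inclusion j of Z into Q.

```latex
% Assumption (illustrative): category of rings with identity; j : \mathbb{Z} \to \mathbb{Q} the inclusion.
\begin{align*}
&\text{Monomorphism: } j \text{ is one-one as a function, so } jf = jg \text{ forces } f = g. \\
&\text{Epimorphism: if } f', g' : \mathbb{Q} \to S \text{ satisfy } f'j = g'j, \text{ then for integers } a \text{ and } b \neq 0, \\
&\qquad f'(a/b) = f'(a)\,f'(b)^{-1} = g'(a)\,g'(b)^{-1} = g'(a/b),
  \quad\text{where } f'(b)^{-1} = f'(1/b). \\
&\text{Not an isomorphism: } j \text{ is not onto, so it has no two-sided inverse.}
\end{align*}
```

Thus j is an epimorphism without being onto, and a morphism can be both a monomorphism and an epimorphism without being an isomorphism in a general category.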

Finally a zero object 0 in a category C is an object such that for each X in
Obj(C), each of Morph(X, 0) and Morph(0, X) has exactly one member. It is
immediate that any two zero objects are isomorphic: if 0 and 0' are zero objects,
then Morph(0, 0) and Morph(0', 0') each have just one member, which must be 1_0
and 1_0' in the two cases; the composition of the member of Morph(0, 0') followed
by the member of Morph(0', 0) must be 1_0, and the composition in the other order
must be 1_0', and the isomorphism of 0 with 0' has been exhibited.

Suppose that a zero object exists. Since the composition law for morphisms
in C insists that the composite of a member of Morph(X, 0) and a member
of Morph(0, Y) be in Morph(X, Y), it follows that Morph(X, Y) has a
distinguished member, which we denote by 0_XY. This is called the zero morphism of
Morph(X, Y). By associativity it satisfies f 0_XY = 0_XZ for all f ∈ Morph(Y, Z)
and 0_XY g = 0_WY for all g ∈ Morph(W, X). Since Morph(0, 0) has just one
element, we have 0_00 = 1_0. If X is any other object such that Morph(X, X) has
0_XX = 1_X, then X is a zero object; in fact, the equalities 0_X0 0_0X = 0_00 = 1_0 and
0_0X 0_X0 = 0_XX = 1_X show that X and 0 are isomorphic.

An additive category C is a category with the following three properties:


(i) C has a zero object,
(ii) the product and the coproduct¹⁶ of any two objects in C exist in C,
(iii) each set Morph(X, Y) is an abelian group with the property that the
operation is Z bilinear in the sense that if the operation is + and if f, f'
are arbitrary in Morph(X, Y) and g, g' are arbitrary in Morph(Y, Z), then


        (g + g') ∘ (f + f') = g ∘ f + g' ∘ f + g ∘ f' + g' ∘ f'
    and g ∘ (−f) = (−g) ∘ f = −(g ∘ f).


If C is an additive category, then so is the opposite category C^opp; this fact
will enable us to use duality arguments occasionally. We shall henceforth write
Hom(X, Y) in place of Morph(X, Y) for additive categories.

The zero morphism 0_XY of Hom(X, Y) is the additive identity 0 of the abelian
group Hom(X, Y). In fact, 0_0Y is the additive identity of Hom(0, Y), since
Hom(0, Y) has just one element. Therefore 0_XY = 0_0Y 0_X0 = (0_0Y + 0_0Y) 0_X0 =
0_0Y 0_X0 + 0_0Y 0_X0 = 0_XY + 0_XY, and we obtain 0 = 0_XY.


¹⁵Some authors use the word “epi” as an adjectival form of this noun.
¹⁶These are defined in Section IV.11 of Basic Algebra. They are always unique up to canonical
isomorphism when they exist.

In an additive category a morphism u in Hom(X, Y) is a monomorphism if
whenever uf = 0 with f in some Hom(W, X), then f = 0; a morphism u in
Hom(X, Y) is an epimorphism if whenever f'u = 0 with f' in some Hom(Y, Z),
then f' = 0.

This much structure forces products and coproducts to amount to the same 
thing in an additive category. The precise result is as follows. 


Proposition 4.32. In an additive category, let (C, p_A, p_B) be a product of two
objects A and B. Then there exist unique i_A ∈ Hom(A, C) and i_B ∈ Hom(B, C)
such that


    p_A i_A = 1_A,    p_B i_B = 1_B,    i_A p_A + i_B p_B = 1_C.

These satisfy p_A i_B = 0 and p_B i_A = 0, and (C, i_A, i_B) is a coproduct of A and B.


REMARKS. 

(1) Since the defining properties of an additive category are self dual, any
coproduct has a similar structure and becomes a product. The proof in effect will
show more: whenever there are data A, B, C, i_A, i_B, p_A, p_B satisfying the
displayed identities, then (C, p_A, p_B) is a product of A and B, and (C, i_A, i_B) is
a coproduct. Thus a product/coproduct can be recognized without reference to
other objects in the category.

(2) To emphasize the analogy with modules or vector spaces, we write A ⊕ B
for a product or coproduct of A and B in C and call it the direct sum of A and
B. The notation is understood to carry the morphisms i_A, i_B, p_A, p_B along with
it. The direct sum is unique up to an isomorphism that carries the one set of
morphisms i_A, i_B, p_A, p_B to the other.


PROOF. To the pair 1_A ∈ Hom(A, A) and 0 ∈ Hom(A, B), the product C
associates a unique i_A ∈ Hom(A, C) with p_A i_A = 1_A and p_B i_A = 0. Similarly
the product associates a unique i_B ∈ Hom(B, C) with p_A i_B = 0 and p_B i_B =
1_B. Computing with the aid of the Z bilinearity and associativity, we have


        p_A(i_A p_A + i_B p_B) = 1_A p_A + 0 p_B = p_A
    and p_B(i_A p_A + i_B p_B) = 0 p_A + 1_B p_B = p_B.


Therefore h = i_A p_A + i_B p_B is a member of Hom(C, C) with the property that
p_A h = p_A and p_B h = p_B. Since 1_C is another member of Hom(C, C) with this
property, the assumed uniqueness shows that h = 1_C. This proves the displayed
formulas in the proposition and the formulas p_A i_B = 0 and p_B i_A = 0.

For uniqueness of i_A and i_B, suppose that i'_A and i'_B satisfy i'_A p_A + i'_B p_B = 1_C.
Right multiplication by i_A gives i_A = 1_C i_A = (i'_A p_A + i'_B p_B) i_A = i'_A 1_A + i'_B 0 =
i'_A, and similarly i_B = i'_B.

To see that (C, i_A, i_B) is a coproduct of A and B, let f ∈ Hom(A, X) and
g ∈ Hom(B, X) be given, and define h = f p_A + g p_B. This is in Hom(C, X), has
h i_A = f p_A i_A + g p_B i_A = f 1_A = f, and similarly has h i_B = g. For uniqueness
suppose that k is in Hom(C, X) with k i_A = f and k i_B = g. Then k i_A p_A = f p_A
and k i_B p_B = g p_B. Addition gives


    k = k 1_C = k(i_A p_A + i_B p_B) = f p_A + g p_B = h,


and uniqueness is proved.
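
For orientation, the identities of Proposition 4.32 can be checked by hand in a familiar case. The sketch below assumes, purely for illustration, that the additive category is the category of all abelian groups and that C is the Cartesian product A × B with its coordinate projections.

```latex
% Assumption (illustrative): abelian groups, C = A \times B with coordinate projections.
\begin{align*}
&p_A(a,b) = a, \qquad p_B(a,b) = b, \qquad i_A(a) = (a,0), \qquad i_B(b) = (0,b), \\
&p_A i_A = 1_A, \qquad p_B i_B = 1_B, \qquad p_A i_B = 0, \qquad p_B i_A = 0, \\
&(i_A p_A + i_B p_B)(a,b) = (a,0) + (0,b) = (a,b), \quad\text{so } i_A p_A + i_B p_B = 1_C, \\
&\text{and for } f : A \to X,\ g : B \to X:\quad h = f p_A + g p_B \text{ is } h(a,b) = f(a) + g(b).
\end{align*}
```

The last line is the coproduct property in this case: h is the only homomorphism on A × B that restricts to f on A and to g on B.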


For an additive category C, the notions of the kernel and cokernel of a morphism 
are defined by universal mapping properties. Problems 18-22 at the end of 
Chapter VI of Basic Algebra discussed universal mapping properties abstractly, 
saying what they are in a general context. For current purposes it is enough to 
know that what a universal mapping property produces (if it produces anything 
at all) is a pair consisting of an object and a morphism, and moreover the pair is 
automatically unique (if it exists) up to canonical isomorphism. 

We allow ourselves to write morphisms as arrows in any of the customary ways
for functions. Thus a member u of Hom(A, B) may be written as A --u--> B, and
a composition of u followed by a morphism v ∈ Hom(B, C), which has been
written as v ∘ u or as vu, may be written as A --u--> B --v--> C.

If A --u--> B is a morphism in the additive category C, then the kernel of u,
denoted by ker u, is a pair (K, i) with i ∈ Hom(K, A) such that the composition
K --i--> A --u--> B has ui = 0 and such that for any pair (K', i') with i' in
Hom(K', A) for which ui' = 0, there exists a unique φ ∈ Hom(K', K) with
iφ = i'. See Figure 4.6. It is customary to drop all mention of K in the definition
of kernel, saying that the kernel is i, since any mention of i carries along K as the
domain of i; we shall adopt this abbreviated terminology shortly but shall refer
to the pair (K, i) as the kernel for the time being.


[Diagram: K --i--> A --u--> B, together with i' : K' → A and the induced φ : K' → K satisfying iφ = i'.]

FIGURE 4.6. Universal mapping property of a kernel (K, i) of u.


The brief form of the definition of kernel is that u ∘ (ker u) = 0 and

    ui' = 0 implies i' = (ker u) ∘ φ uniquely.


The kernel of u is determined only up to an isomorphism applied to K; that is, i
is determined only up to right multiplication by an isomorphism. The condition
for (K, i) to be a kernel is equivalent to the exactness, for every object K', of the
sequence of abelian groups


    0 → Hom(K', K) --Hom(1,i)--> Hom(K', A) --Hom(1,u)--> Hom(K', B).


In fact, ui = 0 makes the sequence a complex, the existence of φ produces
exactness at Hom(K', A), and the uniqueness of φ produces exactness at Hom(K', K).
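
To make the universal property concrete, here is a small worked example. It assumes, as an illustration of ours rather than anything in the text, that the category is that of all abelian groups and that u is the quotient map q : Z → Z/n for an integer n ≥ 1.

```latex
% Assumptions (illustrative): abelian groups, n \ge 1, u = q : \mathbb{Z} \to \mathbb{Z}/n the quotient map.
\begin{align*}
&K = \mathbb{Z}, \qquad i : \mathbb{Z} \to \mathbb{Z}, \quad i(x) = nx, \qquad q\,i = 0. \\
&\text{If } i' : K' \to \mathbb{Z} \text{ has } q\,i' = 0, \text{ then } i'(K') \subseteq n\mathbb{Z}, \text{ and }
  \varphi(x) = i'(x)/n \text{ is the unique map with } i\varphi = i'. \\
&\text{Other kernels of } q:\quad (\mathbb{Z},\, -i) \text{ with } -i = i \circ (-1), \qquad
  (n\mathbb{Z},\, \mathrm{incl}) \text{ with } i = \mathrm{incl} \circ (x \mapsto nx).
\end{align*}
```

The last line shows the ambiguity described above: any two kernels differ by right multiplication by an isomorphism, here multiplication by −1 on Z or the isomorphism x ↦ nx of Z onto nZ.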

Similarly the cokernel of u, denoted by coker u, is a pair (C, p) with p in
Hom(B, C) such that the composition A --u--> B --p--> C has pu = 0 and such
that for any pair (C', p') with p' in Hom(B, C') for which p'u = 0, there exists
a unique ψ ∈ Hom(C, C') with ψp = p'. See Figure 4.7. It is customary to
drop all mention of the object C in the definition of cokernel, saying that the
cokernel is p, since any mention of p carries along C as the codomain of p; we
shall adopt this abbreviated terminology shortly but shall refer to the pair (C, p)
as the cokernel for the time being.


[Diagram: A --u--> B --p--> C, together with p' : B → C' and the induced ψ : C → C' satisfying ψp = p'.]

FIGURE 4.7. Universal mapping property of a cokernel (C, p) of u.


The brief form of the definition of cokernel is that (coker u) ∘ u = 0 and

    p'u = 0 implies p' = ψ ∘ (coker u) uniquely.

The cokernel of u is determined only up to an isomorphism applied to C; that is, p
is determined only up to left multiplication by an isomorphism. The condition for
(C, p) to be a cokernel is equivalent to the exactness, for every object C', of the
sequence of abelian groups


    0 → Hom(C, C') --Hom(p,1)--> Hom(B, C') --Hom(u,1)--> Hom(A, C').


In fact, pu = 0 makes the sequence a complex, the existence of ψ produces
exactness at Hom(B, C'), and the uniqueness of ψ produces exactness at Hom(C, C').
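
The dual picture can also be checked in a familiar case. The sketch below assumes, for illustration only, the category of all abelian groups and the morphism u : Z → Z given by multiplication by an integer n ≥ 1; the identification of Hom(Z, C') with C' by evaluation at 1 is used in the last line.

```latex
% Assumptions (illustrative): abelian groups, n \ge 1, u : \mathbb{Z} \to \mathbb{Z}, u(x) = nx.
\begin{align*}
&C = \mathbb{Z}/n, \qquad p = q : \mathbb{Z} \to \mathbb{Z}/n \text{ the quotient map}, \qquad p\,u = 0. \\
&\text{If } p' : \mathbb{Z} \to C' \text{ has } p'u = 0, \text{ then } n\,p'(1) = 0, \text{ and }
  \psi(x + n\mathbb{Z}) = p'(x) \text{ is the unique map with } \psi p = p'. \\
&\text{The Hom sequence becomes}\quad
  0 \longrightarrow \{c \in C' : nc = 0\} \longrightarrow C' \xrightarrow{\;n\;} C',
  \quad\text{which is exact.}
\end{align*}
```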


Proposition 4.33. Let C be an additive category. If an element u of Hom(A, B)
has a kernel (K, i) and if m ∈ Hom(B, B') is a monomorphism, then (K, i) is
also a kernel of mu. If u has a cokernel (C, p) and if e ∈ Hom(A', A) is an
epimorphism, then (C, p) is also a cokernel of ue. Briefly

    ker(mu) = ker u    and    coker(ue) = coker u.

REMARK. We can safely omit the proof of any dual statement about additive
categories, since the dual follows by expressing the original argument as a
diagram, reversing all the arrows, and writing down the argument that the new
diagram represents.


PROOF. We test whether i = ker u is a kernel of mu. We know that (mu)i =
m(ui) = 0. Suppose that mui' = 0 with i' ∈ Hom(K', A). Since m is a
monomorphism, ui' = 0. Because i is a kernel of u, we obtain i' = iφ for a
unique φ ∈ Hom(K', K). Hence i is a kernel of mu. The statement about
cokernels is dual.


Proposition 4.34. Let C be an additive category. If an element u of Hom(A, B)
has a kernel (K, i), then i is a monomorphism. Dually if u has a cokernel (C, p),
then p is an epimorphism.


PROOF. Suppose that u has a kernel (K, i). For any object K', the zero
morphism i' = 0 of Hom(K', A) has the property that ui' = 0. The uniqueness
property of the kernel says that the φ in Hom(K', K) with iφ = i' is unique.
Evidently φ = 0 is one such choice and hence is the only such choice. Thus if f
in Hom(K', K) has if = 0, then f = 0. Therefore i is a monomorphism.


Propositions 4.33 and 4.34 give a first hint that the notation (K, i) for the kernel,
which we know is redundant, may also be inconvenient; it would be far simpler
to refer to the kernel as i, and analogously for cokernels. Then Proposition 4.33
could truly be stated as the displayed formulas in its statement, and Proposition
4.34 would have the tidier statement that every kernel is a monomorphism and
every cokernel is an epimorphism. Let us therefore now allow ourselves to regard
kernels and cokernels as morphisms, rather than pairs consisting of an object and
a morphism. With this convention in place, we always have u ∘ (ker u) = 0 and
(coker u) ∘ u = 0.


Proposition 4.35. Let C be an additive category, and let u be in Hom(A, B). If
u has a kernel and ker u has a cokernel, then ker u is a kernel of coker(ker u). Briefly


    ker(coker(ker u)) = ker u.

Dually if u has a cokernel and coker u has a kernel, then

    coker(ker(coker u)) = coker u.


PROOF. Let (K, i) be a kernel of u, and let (C, p) be a cokernel of i. We are to
show that i is a kernel of p. For the existence step, suppose that i' in Hom(K', A)
has pi' = 0. We are to show that i' factors as i' = iφ for some unique φ in
Hom(K', K). We know that ui = 0. Since p = coker i, u factors as u = ψp for
some ψ in Hom(C, B). Then ui' = (ψp)i' = ψ(pi') = 0. Since i = ker u, i'
factors as i' = iφ as required. This proves existence of φ.

For the uniqueness step, suppose that pi' = 0 for some i' in some Hom(K', A).
If i' were to have two distinct factorizations, say as i' = iφ = iφ', then i could
not be a monomorphism, in contradiction to Proposition 4.34 and the fact that
i = ker u. This proves uniqueness of φ.


An abelian category C is an additive category with the following two properties:


(iv) every morphism has a kernel and a cokernel, 
(v) every monomorphism is a kernel, and every epimorphism is a cokernel. 


It is evident that the opposite category of any abelian category is abelian. Thus 
we can continue to use duality arguments. 

Property (iv) is certainly desirable if one wants to have a theory involving ho- 
mology and cohomology. Property (v) may be viewed as a converse to Proposition 
4.34; some other authors use a different but equivalent formulation of this axiom. 
The objective is to have a generalization of the kind of factorization that one has 
with homomorphisms of abelian groups: any homomorphism factors canonically 
as the product of the canonical passage to the quotient by the kernel, followed by 
an isomorphism of this quotient onto the image of the homomorphism, followed 
by the inclusion of the image into the range. 
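
As a sketch of this three-step factorization in the smallest interesting case, take the endomorphism f of Z/4 given by multiplication by 2. The example and the names q, θ, and incl are our own illustrative choices, not notation from the text.

```latex
% Assumption (illustrative): abelian groups, f : \mathbb{Z}/4 \to \mathbb{Z}/4, f(x) = 2x.
\begin{align*}
&\ker f = \{0, 2\}, \qquad \operatorname{image} f = \{0, 2\}, \\
&q : \mathbb{Z}/4 \to (\mathbb{Z}/4)/\{0,2\} \cong \mathbb{Z}/2 \quad\text{(passage to the quotient by the kernel)}, \\
&\theta : (\mathbb{Z}/4)/\{0,2\} \to \{0,2\}, \quad \theta(x + \{0,2\}) = 2x \quad\text{(isomorphism onto the image)}, \\
&\mathrm{incl} : \{0,2\} \to \mathbb{Z}/4 \quad\text{(inclusion of the image)}, \qquad
  f = \mathrm{incl} \circ \theta \circ q.
\end{align*}
```

Grouping the factors as e = θ ∘ q and m = incl writes f as a monomorphism following an epimorphism, which is the shape of factorization that the axioms are designed to reproduce.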


Proposition 4.36. In any abelian category, every morphism that is both a 
monomorphism and an epimorphism is an isomorphism. 


PROOF. If f ∈ Hom(K, A) is a monomorphism, then f = ker g for some g
in some Hom(A, B) by (v). This fact implies that gf = g ∘ (ker g) = 0. If f
is also an epimorphism, then the equality gf = 0 implies that g = 0. Hence
f = ker 0_AB. Taking K' = A and i' = 1_A in Figure 4.6, we have 0i' = 0 and
thus have 1_A = fφ for some φ in Hom(A, K). Thus the monomorphism f has
a right inverse and must be an isomorphism.
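
Axiom (v) is essential here. The following sketch, which is our own illustration and assumes the additive category F of finitely generated free abelian groups, exhibits a morphism that is both a monomorphism and an epimorphism in F but not an isomorphism, so that F cannot be abelian.

```latex
% Assumption (illustrative): \mathcal{F} = finitely generated free abelian groups; f : \mathbb{Z} \to \mathbb{Z}, f(x) = 2x.
\begin{align*}
&\text{Monomorphism: if } g : W \to \mathbb{Z} \text{ has } fg = 2g = 0, \text{ then } g = 0. \\
&\text{Epimorphism: if } g' : \mathbb{Z} \to Z' \text{ has } g'f = 2g' = 0, \text{ then } g' = 0
  \text{ because } Z' \text{ is torsion free.} \\
&\text{Not an isomorphism: } 2k = 1 \text{ has no solution } k \in \mathbb{Z}. \\
&\text{Failure of (v): if } f = \ker g \text{ with } g : \mathbb{Z} \to Z', \text{ then } gf = 0 \text{ gives } g = 0,
  \text{ and } \ker 0 \text{ is an isomorphism, which } f \text{ is not.}
\end{align*}
```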


Lemma 4.37. In an abelian category C, every monomorphism is the kernel of 
its cokernel, and every epimorphism is the cokernel of its kernel. 


PROOF. If m is a monomorphism, then (v) says that m = ker u for some u.
Substituting into the first conclusion of Proposition 4.35, we obtain ker(coker m) =
m. If e is an epimorphism, then (v) says that e = coker u for some u. Substituting
into the second conclusion of Proposition 4.35, we obtain coker(ker e) = e.


Proposition 4.38. In an abelian category C, any morphism f factors as f = me 
for a monomorphism m and an epimorphism e. Here one such factorization is 
given by 

m = ker(coker f) and e = coker(ker f). 


Any other such factorization f = m′e′ has the property that there is some
isomorphism x with e′ = xe and m′x = m.


PROOF. Put m = ker(coker f). Since (coker f)f = 0, the brief form of the
definition of kernel gives f = me for some e. We are going to prove that e is an
epimorphism. Thus suppose that re = 0 for some morphism r. The brief form
of the definition of kernel shows that e = (ker r)e′ for some morphism e′. Then
we have


f = me = m(ker r)e′ = m′e′, where m′ = m ker r.


Being a kernel, ker r is a monomorphism. As the composition of two monomor-
phisms, m′ is a monomorphism. Lemma 4.37 shows that m′ = ker p′, where
p′ = coker m′.

Put p = coker m. The definition of m and the second identity of Proposition
4.35 give p = coker(ker(coker f)) = coker f. Since m′ = ker p′, we have
p′m′ = 0. Hence p′f = p′m′e′ = 0. Since p = coker f, the brief form of the
definition of cokernel shows that p′ = sp for some s. Thus p′m = spm = 0, the
latter equality holding because p = coker m. Since m′ = ker p′, the brief form
of the definition of kernel gives m = m′t for some t.

Resubstituting for m′ gives m = m′t = m(ker r)t. Since m is a monomor-
phism, we can cancel and obtain 1_X = (ker r)t, where X is the codomain of ker r.
In other words, ker r has a right inverse. Being a monomorphism, it must be an
isomorphism. Since any morphism v has v ker v = 0, we obtain r ker r = 0 and
conclude that r = 0. Therefore e is an epimorphism, as asserted.

Since e is an epimorphism, Lemma 4.37 gives e = coker(ker e), and Propo-
sition 4.33 gives ker e = ker(me) = ker f. Therefore e = coker(ker f). This
completes the proof of existence of the decomposition.

For uniqueness, suppose that f = m′e′ for a monomorphism m′ and an
epimorphism e′. Proposition 4.33 gives ker f = ker(m′e′) = ker e′, as well
as ker f = ker(me) = ker e, the understanding being that these equalities hold
up to an isomorphism on the right. Set u = ker e and u′ = ker e′; then u = u′w
for some isomorphism w. Since e and e′ are epimorphisms, Lemma 4.37 gives
e = coker u and e′ = coker u′. Since m′ is a monomorphism, the equality
0 = f(ker f) = fu = m′e′u implies that e′u = 0; by the brief form of the
definition of coker u as a cokernel, e′ factors as e′ = xe for a unique x. Similarly
the equality 0 = f(ker f) = fu′ = meu′ implies that eu′ = 0; by the brief form of
the definition of coker u′ as a cokernel, e factors as e = x′e′ for a unique x′. Then
e = x′e′ = x′xe; since e is an epimorphism, x′x is the identity on its domain.
Similarly e′ = xe = xx′e′, and it follows that xx′ is the identity on its domain.
Consequently x is an isomorphism. Multiplying e′ = xe by m′ on the left gives
me = f = m′e′ = m′xe; since e is an epimorphism, m = m′x. This completes
the proof.
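
For orientation, the factorization can be worked out by hand in the category of abelian groups. The following sketch takes f to be multiplication by 2 on Z/4Z; this example is an illustration chosen here and is not taken from the text.

```latex
\[
f:\mathbb Z/4\mathbb Z\to\mathbb Z/4\mathbb Z,\qquad f(a+4\mathbb Z)=2a+4\mathbb Z .
\]
\[
\ker f:\ \{0,2\}+4\mathbb Z \hookrightarrow \mathbb Z/4\mathbb Z,
\qquad
\operatorname{coker} f:\ \mathbb Z/4\mathbb Z\to
(\mathbb Z/4\mathbb Z)/(2\mathbb Z/4\mathbb Z)\cong\mathbb Z/2\mathbb Z .
\]
\[
e=\operatorname{coker}(\ker f):\ a+4\mathbb Z\mapsto a+2\mathbb Z,
\qquad
m=\ker(\operatorname{coker} f):\ a+2\mathbb Z\mapsto 2a+4\mathbb Z .
\]
\[
me(a+4\mathbb Z)=m(a+2\mathbb Z)=2a+4\mathbb Z=f(a+4\mathbb Z).
\]
```

In this instance the monomorphism factor is the inclusion of the subgroup {0, 2}, written with domain Z/2Z, and the epimorphism factor is reduction modulo 2.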


With this canonical factorization in hand, we introduce two terms that will
simplify the definition of “exact sequence.” We define the image and coimage
of f = me in Hom(A, B) by


m = image f and e = coimage f.


In words, the image of any morphism is its monomorphism factor, and the coimage
is its epimorphism factor; in particular, a monomorphism is its own image, and
an epimorphism is its own coimage.¹⁷ Let us see what the factorization and these
formulas say in terms of diagrams. We write (K, i) for the kernel of f and (C, p)
for the cokernel of f. Let J be the codomain of e, which equals the domain of
m. In terms of a diagram, the situation for f is then given by

     i = ker e      e = coker i      m = ker p      p = coker m
  K -----------> A -------------> J -----------> B -------------> C
     = ker f        = coimage f      = image f      = coker f


The top row of labels explains the relationships among i, e, m, p, and the bottom 
row of labels relates i, e, m, p to f. The morphism f itself is the composition of 
the two morphisms in the center. 

In a good category of modules, we can interpret this diagram in terms of the 
two short exact sequences 


0 ---> K --i--> A --e--> A/image i ---> 0,


0 ---> A/image i --m--> B --p--> C ---> 0,


which we can merge into a single 6-term exact sequence


0 ---> K --i--> A --me=f--> B --p--> C ---> 0.


Now we can define complexes and exact sequences for abelian categories, and 
we can readily check that the new definitions are consistent with the definitions 
for good categories of modules. A chain complex is a doubly infinite sequence of 
morphisms with decreasing indexing such that the consecutive compositions are 
defined and are 0. If f ∈ Hom(A, B) and g ∈ Hom(B, C) are given morphisms,
then the sequence


A --f--> B --g--> C


is exact at B if image f = ker g, or equivalently if coker f = coimage g. As
usual in the subject of abelian categories, the equality sign here means “can be
taken as.” In more detail if f and g decompose as f = me and g = m′e′, image f
is defined to be m, and ker g equals ker e′. Thus the condition for exactness is
that m be a kernel of e′. Since u(ker u) = 0 for any morphism u, exactness at
B implies that e′m = 0. Then gf = m′e′me = 0, and we see that the given
sequence (when extended by 0’s at each end) is a complex.


¹⁷The term “coimage” is not really needed for recognizing exact sequences, but it makes any
implementation of duality more symmetric.

Exactness of any finite or infinite sequence of morphisms whose consecutive 
compositions are defined means exactness at every object X in the sequence 
for which there is an incoming morphism in some Hom(W, X) and there is an 
outgoing morphism in some Hom(X, Y). With the kind of indexing used for a 
chain complex, a sequence 


··· ---> X_n --m_n e_n--> X_{n−1} --m_{n−1} e_{n−1}--> X_{n−2} ---> ···


is exact if m_n = ker e_{n−1}, or equivalently if e_{n−1} = coker m_n, for all n.
For a sequence of four morphisms of the form


0 ---> K --m--> A --e--> C ---> 0,


exactness means exactness at K, A, and C. The conditions are that m is a
monomorphism, e is an epimorphism, and m = ker e (or equivalently that e =
coker m). In this case the sequence is called a short exact sequence.
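
A standard instance in the category of abelian groups, given here only as an illustration, is the sequence built from multiplication by 2 on Z. The sketch assumes the amsmath package for the labeled arrows.

```latex
\[
0 \longrightarrow \mathbb Z \xrightarrow{\ m\ } \mathbb Z
  \xrightarrow{\ e\ } \mathbb Z/2\mathbb Z \longrightarrow 0,
\qquad m(a)=2a,\quad e(a)=a+2\mathbb Z .
\]
\[
m \text{ is one-one},\qquad e \text{ is onto},\qquad
\ker e = 2\mathbb Z = m(\mathbb Z),\qquad
\operatorname{coker} m = \mathbb Z/2\mathbb Z .
\]
```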

One can now proceed to define projectives and injectives for any abelian 
category as certain objects in the same way as in Figures 4.3 and 4.4, and extend 
all the results of earlier sections of this chapter to all abelian categories. We shall 
not carry out this detail.¹⁸

Instead, we shall indicate an approach to carrying out this detail that takes most 
of the difficulty out of translating results from the context of good categories to 
the context of abelian categories. It is to use the notion of “members.” The 
word “members” in the present setting refers to something that substitutes for 
elements in situations in which objects need not necessarily be sets of elements. 
The idea is to recast elements, when they exist, in terms of morphisms and then 
to generalize the resulting definition. For orientation, consider the category C_R
of all unital left R modules, R being a ring with identity. Let us write R_0 for
the left R module R. The elements of a unital left R module X are then in
one-one correspondence with the R homomorphisms of R_0 into X, the element
x corresponding to the homomorphism that carries r to rx. Thus the category
C_R has a distinguished object R_0 such that the elements of any object X are in
one-one correspondence with Hom(R_0, X). Hence any argument about elements
for this category immediately translates into an argument about morphisms.
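
The correspondence can be written out explicitly. In the sketch below, the names Φ and Ψ for the two maps are introduced here for illustration only and do not appear in the text.

```latex
\[
\Phi : X \to \operatorname{Hom}_R(R_0, X),\quad \Phi(x)(r) = rx,
\qquad
\Psi : \operatorname{Hom}_R(R_0, X) \to X,\quad \Psi(\varphi) = \varphi(1).
\]
\[
\Psi(\Phi(x)) = \Phi(x)(1) = 1x = x,
\qquad
\Phi(\Psi(\varphi))(r) = r\,\varphi(1) = \varphi(r\cdot 1) = \varphi(r).
\]
```

The first identity uses that X is unital, and the second uses that φ is an R homomorphism.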

The trouble is that a general abelian category has no distinguished object to play
the role of R_0. The idea for getting around this difficulty is to take all possible
objects X_0 in place of R_0, consider the union on X_0 of all sets Hom(X_0, X),
introduce an equivalence relation, and hope for the best.


¹⁸The entire theory for abelian categories is carried out in detail in Freyd’s book Abelian Cate-
gories: An Introduction to the Theory of Functors.

The definition is as follows. Let C be an abelian category, fix X in Obj(C), 
and consider all morphisms with codomain X. Two such morphisms x and y are
said to be equivalent morphisms for current purposes, written x ≡ y, if there
exist epimorphisms u and v such that xu = yv. It is evident that “equivalent”
is reflexive and symmetric. Transitivity requires proof, and we return to this
matter in a moment. Once ≡ has been shown to be an equivalence relation, an
equivalence class of such morphisms is called a member of X. We write x ∈_m X
to indicate that x is a morphism with codomain X, hence to indicate that x is a
morphism whose equivalence class is a member of X. To avoid clumsy wording
when there is really no possibility of confusion, we often simply say that x is 
a member of X. The question arises whether this definition presents any set- 
theoretic difficulties. As usual in category theory, one can answer the question 
painlessly by working when necessary only with subcategories for which the 
objects actually form a set; in this case, the union over all objects X and Y in the 
subcategory of all the groups Hom(X, Y) of morphisms is a set, and there is no 
problem. Let us return to a special case of our example. 


EXAMPLE OF MEMBERS. Let C = C_Z be the category of all abelian groups, and
fix an abelian group X. If x is an abelian-group homomorphism with codomain
X, let us use Proposition 4.38 to write x = me for a monomorphism m and
an epimorphism e. Then x ≡ m, and thus we might just as well consider only
one-one homomorphisms into X. If H is the image of x, then we can view
x as a composition x = i_H y of a homomorphism y carrying the domain of x
onto H, followed by the inclusion i_H : H → X. The homomorphism y is an
isomorphism, hence is an epimorphism. Thus x ≡ i_H. It is apparent that no
two inclusions of distinct subgroups of X into X are equivalent morphisms. Since
every inclusion of a subgroup of X into X yields a member of X, the members of
X are exactly the subgroups of X. Thus for example the set of members of Z
corresponds to the set of integers ≥ 0, in which addition is lost, and does not
correspond exactly to the set of elements of Z. This fact is a little discouraging,
but it turns out not to be as bad an omen as one might expect.
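
The same reasoning applies to a finite cyclic group X = Z/nZ, whose members are then its subgroups. The following Python sketch enumerates them by brute force and compares the count with the number of divisors of n. The function names are hypothetical, and the code relies on the standard fact that every subgroup of a cyclic group is cyclic.

```python
def subgroups_of_cyclic(n):
    """Return the set of subgroups of Z/nZ, each as a frozenset of residues.

    Every subgroup of a cyclic group is cyclic, so it is enough to collect
    the subgroups generated by single residues g.
    """
    found = set()
    for g in range(n):
        # <g> consists of all multiples of g reduced modulo n.
        found.add(frozenset((g * k) % n for k in range(n)))
    return found


def divisor_count(n):
    """Number of positive divisors of n."""
    return sum(1 for d in range(1, n + 1) if n % d == 0)


if __name__ == "__main__":
    for n in (6, 8, 12):
        members = subgroups_of_cyclic(n)
        # Print n, the number of subgroups, and the number of divisors of n.
        print(n, len(members), divisor_count(n))
```

As with Z itself, the group structure on elements is not visible at the level of members: Z/12Z has twelve elements but only six members.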


Returning to the setting of a general abelian category, we work toward a proof
that ≡ is an equivalence relation. We need the notion of the “pullback” of two
morphisms, which we define by a universal mapping property momentarily. The 
appropriate construction establishing existence appears in the next proposition. 
Then we prove a proposition for using pullback as a tool, and afterward we prove 
the transitivity. 

In an abelian category C, let X, Y, Z be objects, and let f ∈ Hom(X, Z)
and g ∈ Hom(Y, Z) be morphisms. A pullback of the pair (f, g) is a triple
(W, f̃, g̃) in which W is an object in C, in which f̃ and g̃ are morphisms with
f̃ ∈ Hom(W, Y) and g̃ ∈ Hom(W, X), and in which the following universal
mapping property holds: whenever (W′, f′, g′) is a triple such that W′ is an object
in C and f′ and g′ are morphisms with f′ ∈ Hom(W′, Y) and g′ ∈ Hom(W′, X)
and with fg′ = gf′, then there exists a unique φ ∈ Hom(W′, W) such that
f′ = f̃φ and g′ = g̃φ. See Figure 4.8.


FIGURE 4.8. The pullback of a pair (f, g) of morphisms. 


Proposition 4.39. In an abelian category C, let X, Y, Z be objects, and let
f ∈ Hom(X, Z) and g ∈ Hom(Y, Z) be morphisms. Let X ⊕ Y be the direct
sum, let p_X and p_Y be the projections on the two factors, define h = fp_X − gp_Y
in Hom(X ⊕ Y, Z), and let m = ker h. Then a pullback (W, f̃, g̃) of (f, g) is
given by W = domain m, f̃ = p_Y m, and g̃ = p_X m.


REMARKS. The dual statement asserts the existence of a pushout of a pair 
of morphisms, and it is a consequence of Proposition 4.39. Problem 35 at the 
end of the chapter points out that the proof of Proposition 4.19a made use of a 
concretely constructed pullback, while the proof of Proposition 4.19b made use 
of a concretely constructed pushout. 




PROOF. From hm = h ker h = 0, we obtain 0 = fp_X m − gp_Y m = fg̃ − gf̃,
and thus fg̃ = gf̃. Now suppose that W′, f′, and g′ are given with fg′ =
gf′. Then m′ = (g′, f′) is a morphism in Hom(W′, X ⊕ Y) such that hm′ =
fp_X m′ − gp_Y m′ = fg′ − gf′ = 0. Therefore m′ factors through m = ker h as
(g′, f′) = mφ for a unique φ ∈ Hom(W′, W). Application of p_X and p_Y to this
equality gives g′ = p_X mφ = g̃φ and f′ = p_Y mφ = f̃φ.
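
In the category of abelian groups the construction of Proposition 4.39 reduces to the familiar fiber product. The following sketch spells this out with elements; the element notation (x, y) and w′ is used only for this illustration.

```latex
\[
W = \ker h = \{(x,y)\in X\oplus Y \;:\; f(x) = g(y)\},
\qquad
\tilde f(x,y) = y,\quad \tilde g(x,y) = x .
\]
\[
f\tilde g(x,y) = f(x) = g(y) = g\tilde f(x,y),
\qquad
\varphi(w') = \bigl(g'(w'),\, f'(w')\bigr) \ \text{ for } w'\in W' .
\]
```

The condition fg′ = gf′ is exactly what guarantees that φ(w′) lies in W.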


Proposition 4.40. In the notation of Figure 4.8 and Proposition 4.39 if f is a
monomorphism, then so is f̃. If f is an epimorphism, then so is f̃; in the case
of an epimorphism, ker f factors as ker f = g̃(ker f̃).


PROOF. Throughout the proof let i_X and i_Y be the injections associated with the
direct sum X ⊕ Y. Suppose that f is a monomorphism, and suppose that f̃w = 0
for some morphism w with codomain W. Since f̃ = p_Y m, p_Y mw = 0. Then
0 = (fp_X − gp_Y)mw = fp_X mw − 0 = fp_X mw. Since f is a monomorphism,
p_X mw = 0. Since also f̃w = p_Y mw = 0, mw = (i_X p_X + i_Y p_Y)mw = 0. But m
is a monomorphism, and therefore w = 0. Consequently f̃ is a monomorphism.

For the remainder of the proof, assume that f is an epimorphism. Let us
see that h = fp_X − gp_Y is an epimorphism. In fact, if zh = 0, then 0 =
z(fp_X − gp_Y)i_X = zfp_X i_X = zf. Since f is an epimorphism, z = 0. Thus h is
an epimorphism.

It follows from Lemma 4.37 that h = coker(ker h) = coker m. To prove that
f̃ is an epimorphism, suppose that vf̃ = 0 for some morphism v with domain
Y. This means that vp_Y m = 0. Since h is the cokernel of m, vp_Y factors as
vp_Y = v′h for some morphism v′. Applying i_X on the right end of both sides
gives 0 = vp_Y i_X = v′hi_X = v′(fp_X − gp_Y)i_X = v′fp_X i_X = v′f. Since f is
an epimorphism, v′ = 0. Hence vp_Y = v′h = 0. Since p_Y is an epimorphism,
v = 0. Therefore f̃ is an epimorphism.

Now set k = ker f, and let K be its domain. The morphisms k ∈ Hom(K, X)
and 0 ∈ Hom(K, Y) have fk = 0 = g0. If we set W′ = K, f′ = 0, and g′ = k,
then fg′ = gf′, and Proposition 4.39 produces a unique φ in Hom(K, W) with
0 = f̃φ and k = g̃φ. We shall show that φ is a kernel of f̃, and then the equation
k = g̃φ completes the proof.

We know that f̃φ = 0. Thus suppose that f̃v = 0 for some morphism v in
some Hom(K′, W). Since fg̃ = gf̃, we have fg̃v = gf̃v = 0. Thus g̃v factors
through k = ker f as g̃v = kv′ for some v′ in Hom(K′, K).

Put Φ = v − φv′. Then f̃Φ = f̃v − f̃φv′ = 0 − 0 = 0, and g̃Φ =
g̃v − g̃φv′ = kv′ − kv′ = 0. Consequently if we put W″ = K′, f″ = 0, and
g″ = 0, then Φ and 0 are two morphisms in Hom(K′, W) with f″ = f̃Φ = f̃0
and g″ = g̃Φ = g̃0. By uniqueness of the morphism in the universal mapping
property for pullbacks, Φ = 0. Therefore v = φv′, and v has been exhibited as
factoring through φ.

If v factors through φ also as v = φv″, then 0 = φ(v′ − v″), and we obtain
k(v′ − v″) = g̃φ(v′ − v″) = 0. Since k = ker f is a monomorphism, v′ = v″.
Thus the factorization of v through φ is unique, and φ is a kernel of f̃. This
completes the proof.


Proposition 4.41. Let C be an abelian category, let X be an object in C,
and define x ≡ y for two morphisms x and y with codomain X if there exist
epimorphisms u and v with xu = yv. Then the relation ≡ on the morphisms
with codomain X is transitive and hence is an equivalence relation.


REMARK. A nontrivial special case is that the obvious equivalences xu ≡ x
and x ≡ xv imply the nonobvious equivalence xu ≡ xv when u and v are
epimorphisms.


PROOF. Assuming that x ≡ y and y ≡ z, write xu = yv and yr = zs
for epimorphisms u, v, r, s. Since v and r have the same codomain, namely
domain(y), the pullback (ṽ, r̃) of (v, r) as in Proposition 4.39 is well defined,
and Proposition 4.40 shows that ṽ and r̃ are epimorphisms. Since rṽ = vr̃, we
obtain xur̃ = yvr̃ = yrṽ = zsṽ. The morphisms ur̃ and sṽ are epimorphisms
as compositions of epimorphisms, and therefore x ≡ z.
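
The special case in the remark can be traced through the same argument. In the sketch below, (ũ, ṽ) denotes a pullback of (u, v) in the notation of Proposition 4.39, which is well defined because u and v both have codomain equal to the domain of x; the symbols are introduced here for illustration.

```latex
\[
u\tilde v = v\tilde u, \qquad
\tilde u,\ \tilde v \ \text{ epimorphisms by Proposition 4.40},
\]
\[
(xu)\tilde v = x(u\tilde v) = x(v\tilde u) = (xv)\tilde u
\quad\Longrightarrow\quad xu \equiv xv .
\]
```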


Fix an object X. Then 0_{0X} is a member of X called the zero member, denoted
by 0. Every zero morphism 0_{YX} with codomain X is equivalent to 0_{0X}; in fact,
0_{YX} = 0_{0X}0_{Y0}. The morphism 0_{Y0} is an epimorphism because if f ∈ Hom(0, Z)
has f0_{Y0} = 0_{YZ}, then f is the unique element 0_{0Z} of Hom(0, Z). Conversely
any nonzero morphism r in Hom(Y, X) is inequivalent to 0_{YX}. In fact, an equality
ru = 0_{YX}v for epimorphisms u and v would imply that r = 0_{YX}, since we can
cancel in the equality ru = 0_{YX}v = 0_{YX}u.

Each x ∈_m X has a “negative,” namely the class of the negative of the repre-
sentative x of the member; i.e., taking the negative of a morphism is respected
in passing to classes. We write −x ∈_m X for the negative. (Warning: As
the example with the category of abelian groups shows, one should use care in
inferring any relationship between “negatives” and zero members.)

If f is a morphism in Hom(X, Y), then each member x ∈_m X yields by
composition a well-defined member fx ∈_m Y. To see that this notion is indeed
well defined, suppose that x ≡ x′, and choose epimorphisms u and v with
xu = x′v. Then (fx)u = f(xu) = f(x′v) = (fx′)v shows that fx ≡ fx′.

The main result is Theorem 4.42 below, which gives a calculus for diagram 
chases using members in general abelian categories. After the proof we shall be 
content with one example of how the theorem allows all the diagram chases in 
earlier sections of this chapter to be extended to general abelian categories. The 
example is the proof of the part of the Snake Lemma that involves an explicit 
construction.¹⁹ More examples appear in Problems 34-35 at the end of the
chapter. 


Theorem 4.42. The members of an abelian category satisfy the following
properties:
(a) a morphism f ∈ Hom(X, Y) is a monomorphism if and only if every
    x ∈_m X with fx ≡ 0 has x ≡ 0,
(b) a morphism f ∈ Hom(X, Y) is a monomorphism if and only if every pair
    of members x ∈_m X and x′ ∈_m X with fx ≡ fx′ has x ≡ x′,
(c) a morphism g ∈ Hom(X, Y) is an epimorphism if and only if for each
    y ∈_m Y, there exists some x ∈_m X with gx ≡ y,
(d) a morphism h ∈ Hom(X, Y) is the 0 morphism if and only if every
    x ∈_m X has hx ≡ 0,
(e) a sequence X --f--> Y --g--> Z is exact at Y if and only if gf = 0 and also
    each y ∈_m Y with gy ≡ 0 has some x ∈_m X with fx ≡ y,
(f) whenever x, y, z are members of an object X and x ≡ yu + zv for some
    epimorphisms u and v, then xu′ − yv′ ≡ z for some epimorphisms u′
    and v′.


¹⁹For more detail about this example and for further examples, see Mac Lane’s Categories for
the Working Mathematician.


REMARKS. 

(1) The interpretations of (a) through (e) are straightforward enough and 
already give an indication that the notion of a member may be of some help 
in translating proofs for good categories into proofs for abelian categories. Ap-
plication of (d) to the difference f_1 − f_2 of two morphisms in Hom(X, Y) shows
that f_1 x ≡ f_2 x for all x ∈_m X implies f_1 = f_2.

(2) The interpretation of (f) is more subtle. As the example with the Snake 
Lemma below will show, conclusion (f) makes it possible to mirror in the theory 
of members the kind of subtraction that takes place with elements of a module to 
get their difference to be in the kernel of some homomorphism. 
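
In the category of abelian groups, where the Example of Members identifies the members of X with the subgroups of X, properties (a) through (e) take a familiar form. The following restatement is a sketch for orientation; H and K denote subgroups and are not notation from the text.

```latex
\begin{itemize}
\item If $x \in_m X$ corresponds to the subgroup $H \subseteq X$, then
      $fx \in_m Y$ corresponds to the subgroup $f(H) \subseteq Y$.
\item (a) $f$ is one-one if and only if $f(H) = 0$ implies $H = 0$.
\item (c) $g$ is onto if and only if every subgroup $K \subseteq Y$ equals
      $g(H)$ for some subgroup $H \subseteq X$.
\item (d) $h = 0$ if and only if $h(H) = 0$ for every subgroup $H \subseteq X$.
\item (e) $X \xrightarrow{f} Y \xrightarrow{g} Z$ is exact at $Y$ if and only if
      $gf = 0$ and every subgroup $K \subseteq Y$ with $g(K) = 0$ equals
      $f(H)$ for some subgroup $H \subseteq X$.
\end{itemize}
```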


PROOF. For (a) and (b), if f is a monomorphism and fx ≡ fx′, then fxu =
fx′v for suitable epimorphisms u and v, and cancellation yields xu = x′v and
hence x ≡ x′. Conversely suppose fx ≡ 0 only for x ≡ 0. If f has fx′ = 0_{AY}
for some x′ in some Hom(A, X), then fx′ ≡ 0 and so x′ ≡ 0 by hypothesis. In
this case, x′ = 0_{AX} because we know that nonzero morphisms are not equivalent
to 0. Thus f is a monomorphism.

For (c), suppose that g is an epimorphism. If y ∈_m Y is given, let y be
in Hom(X′, Y), and let (g̃, ỹ) be the pullback of (g, y), satisfying yg̃ = gỹ.
Proposition 4.40 shows that g̃ is an epimorphism, and then y ≡ gx for x = ỹ.
Conversely if g fails to be an epimorphism, then there exists r ≠ 0 in some
Hom(Y, Z) with rg = 0_{XZ}. If there is some x in some Hom(A, X) with gx ≡ 1_Y,
we can compose with r on the left of both sides and obtain rgx ≡ r1_Y = r. Since
the left side equals 0_{AZ}, which is equivalent to 0_{YZ}, we obtain 0_{YZ} ≡ 0_{AZ} ≡ r,
which we know not to be true for nonzero members r of Hom(Y, Z).

For (d), if h = 0_{XY} and if x is in Hom(Z, X), then hx = 0_{XY}x = 0_{ZY} ≡ 0_{0Y}.
Conversely if every x in every Hom(Z, X) has hx ≡ 0_{0Y}, we take Z = X and
x = 1_X. Then hu = hxu = 0_{0Y}v for some epimorphisms u ∈ Hom(A, X) and
v ∈ Hom(A, 0). This says that hu = 0_{AY} = 0_{XY}u. Since u is an epimorphism,
h = 0_{XY}.

For (e), let f = me be the decomposition of f as in Proposition 4.38. Then
m = image f, and we define k = ker g. If the sequence is exact at Y, then
gf = 0 as part of the definition. Suppose y ∈_m Y has gy ≡ 0, i.e., gy = 0. Since
m = ker g by exactness, the equality gy = 0 and the definition of kernel together
imply that y = my′ for some y′. Using Proposition 4.39, let (e, y′) have (ẽ, ỹ′)
as pullback, satisfying eỹ′ = y′ẽ. Since e by construction is an epimorphism,
Proposition 4.40 shows that ẽ is an epimorphism. From the computation fỹ′ =
meỹ′ = my′ẽ = yẽ, we obtain fỹ′ ≡ y. Then x = ỹ′ has x ∈_m X and fx ≡ y.

Conversely suppose that gf = 0 and that the other condition holds. Since e
is an epimorphism, the equality gf = 0 implies that gm = 0. The definition of
k = ker g thus gives m = kφ for some morphism φ. Meanwhile, the morphism
k = ker g has k ∈_m Y and gk = 0. Thus gk ≡ 0. The hypothesis produces
x ∈_m X with fx ≡ k, i.e., with mexu = kv for suitable epimorphisms u and
v. Write ex = m′e′ according to Proposition 4.38. Then mm′e′u = kv, and the
uniqueness in Proposition 4.38 shows that k = mm′ψ for some isomorphism ψ.
Putting the results together gives m = kφ = mm′ψφ and k = mm′ψ = kφm′ψ.
Since m and k are monomorphisms, 1 = m′ψφ and 1 = φm′ψ. These show that
φ has a left inverse and a right inverse, hence is an isomorphism. Then m′ too is
an isomorphism, and k = m except for a factor of an isomorphism on the right
side. This means that we can take ker g = image f and that the given sequence
is exact at Y.

For (f), let x ≡ yu + zv. Then xu_1 = (yu + zv)v_1 for suitable epimorphisms
u_1 and v_1, and xu_1 − y(uv_1) = zvv_1. Consequently xu_1 − y(uv_1) = zvv_1 ≡ z,
and (f) follows with u′ = u_1 and v′ = uv_1.


Theorem 4.42 enables us to use members to verify properties of morphisms in 
diagrams, but it does not by itself construct any morphisms. That is, just because 
we know what the equivalence class of fx should be for every x ∈_m X does not
mean that we have a construction of f; it means only that we know how to work
with f once f is known to exist. Specifically we know from Remark 1 with
the theorem that there cannot be a different morphism g with fx ≡ gx for all
x ∈_m X. Some tools that we have for constructing morphisms for a general abelian
category are the existence of kernels and cokernels via Axiom (iv), Proposition 
4.39 asserting the existence of pullbacks of pairs of morphisms, and the dual of 
Proposition 4.39 asserting the existence of pushouts of pairs of morphisms. For 
particular categories of interest, the hypotheses “enough projectives” and “enough 
injectives” provide additional constructions of morphisms. 

The most complicated example of a constructed mapping that we encountered 
in the theory for good categories was the connecting homomorphism in the Snake 
Lemma. In the generalization to abelian categories, the construction of the 
connecting morphism has to go outside the usual diagram given in Figure 4.2. 
Problem 33 at the end of the chapter will compare the actual construction and 
Figure 4.2 for the chain map of exact sequences of abelian groups given below 
and observe that the two diagrams are different: 


            ×8          1 ↦ 1 mod 8
   0 ---> Z ---> Z ----------------> Z/8Z ---> 0
          |      |                    |
       ×4 |   ×1 |                    | 1 mod 8 ↦ 1 mod 2
          v      v                    v
   0 ---> Z ---> Z ----------------> Z/2Z ---> 0
            ×2          1 ↦ 1 mod 2

The domain of the connecting homomorphism for this situation is the set of even
members of Z/8Z, and the mapping carries 2 + 8Z to 1 + 4Z in Z/4Z.
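
The value 2 + 8Z ↦ 1 + 4Z can be checked by the usual element recipe of the Snake Lemma: lift, apply the middle vertical map, pull back along the lower left map, and reduce modulo the image of the left vertical map. The Python sketch below does this under the assumption that the rows are 0 → Z → Z → Z/8Z → 0 (multiplication by 8) and 0 → Z → Z → Z/2Z → 0 (multiplication by 2) and that the vertical maps are multiplication by 4, multiplication by 1, and reduction modulo 2. These labels and the function name are assumptions made for the illustration.

```python
def connecting_value(x, t=0):
    """Value in Z/4Z of the connecting homomorphism on the class x + 8Z.

    Assumed maps (hypothetical labels): alpha = x4, beta = x1,
    gamma = reduction mod 2, phi = x8, phi' = x2.
    The parameter t selects the lift b = x + 8t of x + 8Z to Z.
    """
    if x % 2 != 0:
        raise ValueError("x + 8Z is not in the kernel of gamma")
    b = x + 8 * t          # a lift of x + 8Z to the middle group Z
    beta_b = 1 * b         # apply beta = multiplication by 1
    a_prime = beta_b // 2  # solve phi'(a') = 2 a' = beta(b); beta(b) is even
    return a_prime % 4     # pass to coker(alpha) = Z/4Z


if __name__ == "__main__":
    for x in (0, 2, 4, 6):
        print(x, connecting_value(x))
```

Changing the lift t changes a′ by a multiple of 4, so the class in Z/4Z is unaffected, in line with the independence of choices proved for the general construction.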


EXAMPLE OF DIAGRAM CHASE. In the setting of the Snake Lemma (Lemma
4.6), we shall construct the connecting morphism ω and verify that its value on
each member of its domain corresponds to what we expect on the basis of Lemma
4.6. The given snake diagram, partially enlarged toward Figure 4.2, is

                                    C_0
                                     |
                                   k |
                                     v
           A ----φ----> B ----ψ----> C -----> 0
           |            |            |
         α |          β |          γ |
           v            v            v
  0 -----> A′ ---φ′---> B′ ---ψ′---> C′                     (*)
           |
         p |
           v
           A′_0

with the rows exact and the squares commuting. The added parts at the top
and bottom are the kernel (C_0, k) of γ and the cokernel (A′_0, p) of α. Once
the connecting homomorphism has been constructed, the proof of exactness will
involve a diagram chase that makes rather straightforward use of Theorem 4.42,
including conclusion (f). By contrast, the initial construction will involve a
different sort of diagram, namely


           Ā - - φ̃ - -> B_0 - - ψ̃ - -> C_0
           ‖            |             |
           ‖          k̃ |           k |
           ‖            v             v
  0 -----> Ā ----φ̄----> B ----ψ-----> C -----> 0
           |            |             |
         ᾱ |          β |           γ̄ |
           v            v             v
  0 -----> A′ ---φ′---> B′ ---ψ̄′----> C̄′ ----> 0
           |            |             ‖
         p |          p̃ |             ‖
           v            v             ‖
           A′_0 - φ̃′ -> B′_0 - ψ̃′ - > C̄′

In the construction we adjust the first row of (*) to make it exact when a
0 is included at the left end. To do so, we factor φ according to Proposition
4.38 as φ = me, we let Ā = domain m = codomain e, and we write φ̄ for
m. The commutativity of the left square of (*) implies that φ′α(ker φ) =
βφ(ker φ) = 0. Since φ′ is a monomorphism, α(ker φ) = 0. Then the fact
e = coker(ker φ) implies that α factors through e as α = ᾱe for some ᾱ with
domain Ā. Consequently the left square in the adjusted diagram commutes, and
the first row is exact with the 0 inserted at the left. Since e is an epimorphism,
p = coker α = coker(ᾱe) = coker ᾱ, and the vertical line at the left is exact.

By a dual argument starting from a factorization of ψ′, we can replace the
triple (C′, ψ′, γ) in similar fashion by (C̄′, ψ̄′, γ̄), see that k = ker γ̄, and add
a 0 at the end of the second row to obtain an exact sequence.

Next, let (B_0, ψ̃, k̃) be a pullback of (ψ, k). Proposition 4.40 shows that ψ̃
is an epimorphism and that ker ψ = k̃ ker ψ̃. Since the first row is a short exact
sequence, we know that φ̄ = ker ψ, and the condition ker ψ = k̃ ker ψ̃ shows
that φ̃ = ker ψ̃ satisfies φ̄ = k̃φ̃. This completes the dashed arrows in the top
part of the diagram. By a dual argument using p = coker ᾱ, we complete the
dashed arrows in the bottom part of the diagram, deducing from ψ̄′ = coker φ′
the fact that ψ̃′ = coker φ̃′ satisfies ψ̄′ = ψ̃′p̃.


Lemma 4.37 shows from φ̃ = ker ψ̃ that ψ̃ = coker φ̃, and it shows from
ψ̃′ = coker φ̃′ that φ̃′ = ker ψ̃′. With these formulas in hand, we can construct
the connecting morphism. Define ω_0 = p̃βk̃ in Hom(B_0, B′_0) to be the
composition down the center. Then ω_0φ̃ = p̃βk̃φ̃ = p̃βφ̄ = p̃φ′ᾱ = φ̃′pᾱ = 0,
the last equality holding because pᾱ = 0. Therefore ω_0 factors through
ψ̃ = coker φ̃ as ω_0 = ω_1ψ̃ for some ω_1 ∈ Hom(C_0, B′_0). The morphism ω_1
satisfies ψ̃′ω_1ψ̃ = ψ̃′ω_0 = ψ̃′p̃βk̃ = ψ̄′βk̃ = γ̄ψk̃ = γ̄kψ̃ = 0, the last equality
holding because γ̄k = 0. Since ψ̃ is an epimorphism, we can cancel it, obtaining
ψ̃′ω_1 = 0. Therefore ω_1 factors through φ̃′ = ker ψ̃′ as ω_1 = φ̃′ω for some
morphism ω ∈ Hom(C_0, A′_0).
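
The construction just given consists of one composition followed by two factorizations. Schematically, with tildes marking the pullback and pushout morphisms and primes marking the second row (this decoration of the symbols is an assumption about the notation), the steps are as follows. The sketch assumes the amsmath package.

```latex
\begin{align*}
\omega_0 &= \tilde p\,\beta\,\tilde k \in \operatorname{Hom}(B_0, B_0')
  && \text{(composition down the center)},\\
\omega_0\tilde\varphi &= 0
  \ \Longrightarrow\ \omega_0 = \omega_1\tilde\psi,\quad
  \omega_1 \in \operatorname{Hom}(C_0, B_0')
  && (\tilde\psi = \operatorname{coker}\tilde\varphi),\\
\tilde\psi'\omega_1 &= 0
  \ \Longrightarrow\ \omega_1 = \tilde\varphi'\omega,\quad
  \omega \in \operatorname{Hom}(C_0, A_0')
  && (\tilde\varphi' = \ker\tilde\psi').
\end{align*}
```

The first factorization uses the cokernel property and the second uses the kernel property, so ω is determined uniquely once the pullback and pushout have been fixed.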

The construction of ω is now complete, and the assertion is that the value of ω
on members corresponds to what we expect from the proof of Lemma 4.6. Since
equivalences ωx ≡ ω′x for some other candidate ω′ for the connecting morphism
and for all x ∈_m C_0 would imply that ω = ω′, the argument will show that we
have found the unique morphism taking the prescribed values on members.


During the verification we refer to (*) to do the diagram chase. The member of
C corresponding to x ∈_m C_0 is kx ∈_m C. Since ψ is an epimorphism, Theorem
4.42c produces b ∈_m B with ψb ≡ kx. Then ψ′βb = γψb ≡ γkx = 0, since
γk = 0. Theorem 4.42e and exactness at B′ imply that φ′a′ ≡ βb for some
a′ ∈_m A′, and the class of a′ is unique (for the b under consideration) by Theorem
4.42b because φ′ is a monomorphism. We shall verify that ωx ≡ pa′, and then
the class of ωx matches what we expect from the proof of Lemma 4.6.

First let us show that a different choice of b, say b_1, leads to the same class
pa′. We are given that ψb ≡ ψb_1. Let a′ and a′_1 be the corresponding members
of A′ with φ′a′ ≡ βb and φ′a′_1 ≡ βb_1. We shall make repeated use of Theorem
4.42f, letting subscripted u’s and v’s denote suitable epimorphisms. From ψb ≡
ψb_1, Theorem 4.42f gives ψbu_1 − ψb_1v_1 ≡ 0, i.e., ψ(bu_1 − b_1v_1) ≡ 0. By
Theorem 4.42e and exactness at B, bu_1 − b_1v_1 ≡ φa for some a ∈_m A. Hence
βbu_1 − βb_1v_1 ≡ βφa = φ′αa. Two applications of Theorem 4.42f starting from
βbu_1 − βb_1v_1 ≡ φ′αa give


φ′a′ ≡ βb ≡ φ′αau_2 + βb_1v_2,
and then φ′a′u_3 − φ′αav_3 ≡ βb_1 ≡ φ′a′_1.

Since φ′ is a monomorphism, Theorem 4.42b says that

a′u_3 − αav_3 ≡ a′_1.

Applying p, we obtain pa′u_3 − pαav_3 ≡ pa′_1. Since pα = 0, we can drop the
term pαav_3, and we conclude that pa′ ≡ pa′u_3 ≡ pa′_1.
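
For comparison, in a good category of modules the same step is carried out with elements and genuine subtraction; Theorem 4.42f is what substitutes for the subtraction. The following is a sketch of the element version, in which b, b_1, a, a′, a′_1 denote elements rather than members. It assumes the amsmath package.

```latex
\begin{align*}
\psi(b - b_1) = 0
  &\ \Longrightarrow\ b - b_1 = \varphi(a) \ \text{ for some } a \in A,\\
\varphi'(a' - a_1') = \beta(b - b_1) = \beta\varphi(a) = \varphi'\alpha(a)
  &\ \Longrightarrow\ a' - a_1' = \alpha(a),\\
p(a') - p(a_1') = p\alpha(a) = 0
  &\ \Longrightarrow\ p(a') = p(a_1').
\end{align*}
```

Each line of the member argument above corresponds to one of these three implications, with the epimorphisms u_j and v_j absorbing the ambiguity in the choice of representatives.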

We can now return to the verification that ωx ≡ pa′, making use of the adjusted
diagram as necessary.²⁰ Since ψ̃ is an epimorphism, Theorem 4.42c produces
b_0 ∈_m B_0 with ψ̃b_0 ≡ x. Then k̃b_0 ∈_m B has ψk̃b_0 = kψ̃b_0 ≡ kx. Hence k̃b_0
is a member of B like b and b_1 in the previous paragraph. The above argument
shows that βk̃b_0 ∈_m B′ has βk̃b_0 ≡ φ′a′ for some a′ ∈_m A′ and that pa′ ∈_m A′_0
is what we should hope for as the value of ωx. So we compute that


φ̃′ωx = ω_1x ≡ ω_1ψ̃b_0 = ω_0b_0 = p̃βk̃b_0 ≡ p̃φ′a′ = φ̃′pa′.


Since φ̃′ is a monomorphism by the dual of Proposition 4.40, Theorem 4.42b
shows that ωx ≡ pa′, which is the formula we were seeking.


²⁰Warning: The construction of ω involves B_0 and B′_0, which are in the adjusted diagram but
are not in (*). These objects do not necessarily coincide with the domain of ker β and the codomain
of coker β.


9. Problems


1. (a) Prove that the good category of all finitely generated abelian groups has
       enough projectives but not enough injectives.
   (b) Prove that the good category of all torsion abelian groups has enough injec-
       tives but not enough projectives.

2. Let C_Z be the category of all abelian groups. Give an example of a nonzero good
   category C of abelian groups that has enough projectives and enough injectives
   but for which no nonzero projective for C_Z lies in C and no nonzero injective for
   C_Z lies in C.

3. Let R be a semisimple ring in the sense of Chapter II, and let C_R be the category
   of all unital left R modules. Prove that every module in C_R is projective and
   injective.


4. Let R be a (commutative) principal ideal domain, and let C_R be the category of
   all unital R modules. A module M in C_R is divisible if for each a ≠ 0 in R and
   x ∈ M, there exists y ∈ M with ay = x.
   (a) Referring to Example 2 of injectives in Section 4, prove that injective for C_R
       implies divisible.
   (b) Deduce from Proposition 4.15 that divisible implies injective for C_R.

5. Let R be a (commutative) principal ideal domain, and let C_R be the category of all
   unital R modules. Prove that every module M in C_R has an injective resolution
   of the form 0 → M → I_0 → I_1 → 0 with I_0 and I_1 injective.


6. Let C, C′, C″ be good categories of modules with enough projectives and enough
   injectives, let G : C → C′ be a one-sided exact functor with derived functors G_n
   or G^n, and let F : C′ → C″ be an exact functor.
   (a) Prove that if F is covariant, then F∘G is one-sided exact, and its derived
       functors satisfy (F∘G)_n = F∘G_n or (F∘G)^n = F∘G^n.
   (b) Prove that if F is contravariant, then F∘G is one-sided exact, and its derived
       functors satisfy (F∘G)^n = F∘G_n or (F∘G)_n = F∘G^n.


7. Let C, C′, C″ be good categories of modules with enough projectives and enough
   injectives, let F : C → C′ be an exact functor, and let G : C′ → C″ be a
   one-sided exact functor with derived functors G_n or G^n.
   (a) Suppose that F is covariant, that G_n or G^n is defined from projective res-
       olutions, and that F carries projectives to projectives. Prove that G∘F is
       one-sided exact and that its derived functors satisfy (G∘F)_n = G_n∘F or
       (G∘F)^n = G^n∘F.
   (b) Suppose that F is covariant, that G_n or G^n is defined from injective res-
       olutions, and that F carries injectives to injectives. Prove that G∘F is
       one-sided exact and that its derived functors satisfy (G∘F)_n = G_n∘F or
       (G∘F)^n = G^n∘F.
   (c) Suppose that F is contravariant, that G_n or G^n is defined from projective
       resolutions, and that F carries injectives to projectives. Prove that G∘F is
       one-sided exact and that its derived functors satisfy (G∘F)^n = G^n∘F or
       (G∘F)_n = G_n∘F.

(d) Suppose that F is contravariant, that G, or G” is defined from injective 
resolutions, and that F carries projectives to injectives. Prove that G o F is 
one-sided exact and that its derived functors satisfy (G o F)" = G" o F or 
(GoF),=G,oF. 
8. Let $G$ be a group, and let $F = (F^+ \to \mathbb{Z})$ be a free resolution of the trivial
$\mathbb{Z}G$ module $\mathbb{Z}$ in the category $\mathcal{C}_{\mathbb{Z}G}$. If $M$ is an abelian group on which $G$ acts by
automorphisms, then we know that the cohomology $H^n(G, M)$ is defined to be
the $n$th cohomology of the cochain complex $\operatorname{Hom}_{\mathbb{Z}G}(F^+, M)$ and the homology
$H_n(G, M)$ is defined to be the $n$th homology of the chain complex $F^+ \otimes_{\mathbb{Z}G} M$.
Take for granted the result of Proposition 3.32 that if $G$ is a finite cyclic group
with generator $s$, then

$$\cdots \xrightarrow{N} \mathbb{Z}G \xrightarrow{T} \mathbb{Z}G \xrightarrow{N} \mathbb{Z}G \xrightarrow{T} \mathbb{Z}G \xrightarrow{\varepsilon} \mathbb{Z} \to 0$$

is a free resolution of $\mathbb{Z}$ in $\mathcal{C}_{\mathbb{Z}G}$, where $T$ and $N$ are the left $\mathbb{Z}G$ module homomorphisms
defined by

$$T = \text{multiplication by } (s) - (1),$$
$$N = \text{multiplication by } (1) + (s) + \cdots + (s^{|G|-1}).$$

Prove that $H^n(G, M) \cong H^{n+2}(G, M)$ and $H_n(G, M) \cong H_{n+2}(G, M)$ for all
$n \geq 1$ and all $M$ when $G$ is a finite cyclic group.


Problems 9-11 concern changes of rings. Fix a homomorphism $\rho : R \to S$ of rings
with identity. This homomorphism determines three functors of interest, denoted by
$F_S^R : \mathcal{C}_S \to \mathcal{C}_R$, $P_R^S : \mathcal{C}_R \to \mathcal{C}_S$, and $I_R^S : \mathcal{C}_R \to \mathcal{C}_S$. The first takes an $S$ module $M$
and makes it into an $R$ module $F_S^R(M)$ by the definition $rm = \rho(r)m$ for $r \in R$ and
$m \in M$; the effect on an $S$ homomorphism is to leave the function unchanged and to
regard it as an $R$ homomorphism; this functor is manifestly exact. For the second,
regard $S$ as an $(S, R)$ bimodule with right $R$ action given by $sr = s\rho(r)$, and define
$P_R^S(M) = S \otimes_R M$ for $M$ in $\operatorname{Obj}(\mathcal{C}_R)$ and $P_R^S(\varphi) = 1_S \otimes \varphi$ for $\varphi$ in $\operatorname{Hom}_R(M, N)$;
this functor is covariant and right exact. For the third, regard $S$ as an $(R, S)$ bimodule
with left $R$ action given by $rs = \rho(r)s$, and define $I_R^S(M) = \operatorname{Hom}_R(S, M)$ for $M$ in
$\operatorname{Obj}(\mathcal{C}_R)$ and $I_R^S(\varphi) = \operatorname{Hom}(1_S, \varphi)$ for $\varphi$ in $\operatorname{Hom}_R(M, N)$; this functor is covariant
and left exact.


9. If $\mathcal{C}$ and $\mathcal{D}$ are good categories of modules and if $F : \mathcal{C} \to \mathcal{D}$ and $G : \mathcal{D} \to \mathcal{C}$ are
covariant additive functors such that there exist isomorphisms of abelian groups

$$\operatorname{Hom}(F(A), B) \cong \operatorname{Hom}(A, G(B))$$

natural for $A$ in $\operatorname{Obj}(\mathcal{C})$ and for $B$ in $\operatorname{Obj}(\mathcal{D})$, then $F$ is said to be left adjoint to
$G$ and $G$ is said to be right adjoint to $F$.

(a) Prove that if $G$ carries onto maps in $\mathcal{D}$ to onto maps in $\mathcal{C}$, then $F$ carries
projectives in $\mathcal{C}$ to projectives in $\mathcal{D}$.

(b) Prove that if $F$ carries one-one maps in $\mathcal{C}$ to one-one maps in $\mathcal{D}$, then $G$
carries injectives in $\mathcal{D}$ to injectives in $\mathcal{C}$. (Educational note: The conclusions
in this problem extend to any abelian categories $\mathcal{C}$ and $\mathcal{D}$, and in this enlarged
setting, (b) follows from (a) by duality.)
10. (a) Prove that $P_R^S$ is left adjoint to $F_S^R$.

(b) Deduce from the previous problem that $P_R^S$ sends projectives in $\mathcal{C}_R$ to
projectives in $\mathcal{C}_S$.

(c) Prove that if the right $R$ module $S$ is projective, then $P_R^S$ is exact.
(Educational note: In the subject of Lie algebra homology and cohomology,
this hypothesis is satisfied when $S$ is the universal enveloping algebra of a
Lie algebra $\mathfrak{g}$ over a field $K$, $R$ is the universal enveloping algebra of a Lie
subalgebra $\mathfrak{h}$ of $\mathfrak{g}$, and $\rho : R \to S$ is the inclusion. It is satisfied also in the
subject of homology and cohomology of groups if $S$ is the group algebra $KG$
of a group $G$ over a field $K$ and if $R$ is the group algebra $KH$ of a subgroup
$H$. See Problem 13c below.)

(d) Using Problem 7, prove that if the right $R$ module $S$ is projective, then
$\operatorname{Ext}^n_S(P_R^S M, N) \cong \operatorname{Ext}^n_R(M, F_S^R N)$ naturally in each variable ($M$ being in
$\operatorname{Obj}(\mathcal{C}_R)$ and $N$ being in $\operatorname{Obj}(\mathcal{C}_S)$).

(e) Even without the assumption that the right $R$ module $S$ is projective, let
$X = (X^+ \to M)$ be a projective resolution of a module $M$ in $\mathcal{C}_R$, and let
$Y = (Y^+ \to P_R^S M)$ be a projective resolution of $P_R^S M$ in $\mathcal{C}_S$. Construct a
chain map from $P_R^S X$ to $Y$ extending the identity map on $P_R^S M$, and use it to
obtain the associated homomorphism $\operatorname{Ext}^n_S(P_R^S M, N) \to \operatorname{Ext}^n_R(M, F_S^R N)$
natural in each variable.


11. (a) Prove that $I_R^S$ is right adjoint to $F_S^R$.

(b) Deduce from Problem 9 that $I_R^S$ sends injectives in $\mathcal{C}_R$ to injectives in $\mathcal{C}_S$.

(c) Prove that if the right $R$ module $S$ is projective, then $I_R^S$ is exact.

(d) Using Problem 7, prove that if the right $R$ module $S$ is projective, then
$\operatorname{Ext}^n_S(M, I_R^S N) \cong \operatorname{Ext}^n_R(F_S^R M, N)$ naturally in each variable ($M$ being in
$\operatorname{Obj}(\mathcal{C}_S)$ and $N$ being in $\operatorname{Obj}(\mathcal{C}_R)$).

(e) Even without the assumption that the right $R$ module $S$ is projective, let
$X = (X^+ \leftarrow N)$ be an injective resolution of a module $N$ in $\mathcal{C}_R$, and let
$Y = (Y^+ \leftarrow I_R^S N)$ be an injective resolution of $I_R^S N$ in $\mathcal{C}_S$. Construct a
chain map from $Y$ to $I_R^S X$ extending the identity map on $I_R^S N$, and use it
to obtain the associated homomorphism $\operatorname{Ext}^n_S(M, I_R^S N) \to \operatorname{Ext}^n_R(F_S^R M, N)$
natural in each variable.




Problems 12-13 concern the effect on cohomology of groups of changing the group.
The main result is the exactness of the “inflation-restriction sequence”; this is applied
particularly in algebraic number theory to relate Brauer groups (see Chapter III) for
different field extensions. Let $J$ and $K$ be groups, and let $\rho : J \to K$ be a group
homomorphism. By the universal mapping property of group rings, $\rho$ extends to
a ring homomorphism, also denoted by $\rho$, from $\mathbb{Z}J$ into $\mathbb{Z}K$. For any group $G$,
we make use of the standard free resolution $F(G) = (F(G)^+ \xrightarrow{\varepsilon} \mathbb{Z})$ of $\mathbb{Z}$ in the
category $\mathcal{C}_{\mathbb{Z}G}$, as described before Theorem 3.20. A $\mathbb{Z}$ basis of $F_n(G)$ consists of all
tuples $(g_0, \ldots, g_n)$, and a $\mathbb{Z}G$ basis consists of those members of the $\mathbb{Z}$ basis with
$g_0 = 1$. In the context of the groups $J$ and $K$, any $\mathbb{Z}K$ module $M$ becomes a $\mathbb{Z}J$
module by the formula $xm = \rho(x)m$ for $x \in \mathbb{Z}J$ and $m \in M$. In particular, each free
$\mathbb{Z}K$ module $F_n(K)$ can be regarded as a $\mathbb{Z}J$ module. Meanwhile, the homomorphism
$\rho : J \to K$ induces a function from the $\mathbb{Z}J$ basis of $F_n(J)$ into $F_n(K)$ by the formula
$\rho(1, j_1, \ldots, j_n) = (1, \rho(j_1), \ldots, \rho(j_n))$ for $j_1, \ldots, j_n \in J$, and this extends to a
$\mathbb{Z}J$ homomorphism, still called $\rho$, of $F_n(J)$ into $F_n(K)$. A look at the formula for
the boundary operators $\partial_J$ and $\partial_K$ in Section III.5 shows that $\rho$ is a chain map in
the sense that $\partial_K \rho = \rho\,\partial_J$. If $M$ is any unital left $\mathbb{Z}K$ module, then it follows that
$\operatorname{Hom}(\rho, 1) : \operatorname{Hom}(F(K), M) \to \operatorname{Hom}(F(J), M)$ is a cochain map. Consequently
we get maps on cohomology for each $n$ of the form $H^n(\rho) : H^n(K, M) \to H^n(J, M)$.
There are two cases of special interest:

(i) If $\rho : H \to G$ is the inclusion of a subgroup into a group, then the mapping
on cohomology is called the restriction homomorphism

$$\operatorname{Res} : H^n(G, M) \to H^n(H, M).$$

(ii) If $H$ is a normal subgroup of $G$, let $\rho : G \to G/H$ be the quotient
homomorphism. For any $\mathbb{Z}G$ module $M$, let $M^H$ be the subgroup of $H$
invariants. Then $G/H$ acts on $M^H$. The above construction is applicable
to the module $M^H$ for the group ring $\mathbb{Z}(G/H)$ of $G/H$, and we form the
mapping on cohomology $H^n(G/H, M^H) \to H^n(G, M^H)$. The inclusion of
the $\mathbb{Z}G$ module $M^H$ in $M$ induces a mapping $H^n(G, M^H) \to H^n(G, M)$,
and the composition is called the inflation homomorphism

$$\operatorname{Inf} : H^n(G/H, M^H) \to H^n(G, M).$$


When $H$ is a normal subgroup of $G$ and $M$ is a $\mathbb{Z}G$ module and $q \geq 1$ is an integer
such that $H^k(H, M) = 0$ for $1 \leq k \leq q - 1$, the inflation-restriction sequence is
the sequence of abelian groups and homomorphisms

$$0 \to H^q(G/H, M^H) \xrightarrow{\operatorname{Inf}} H^q(G, M) \xrightarrow{\operatorname{Res}} H^q(H, M).$$
12. For $q = 1$, use direct arguments to prove the exactness of the inflation-restriction
sequence by carrying out the following steps:

(a) By sorting out the isomorphism $\Phi_q : \operatorname{Hom}_{\mathbb{Z}G}(F_q, M) \to C^q(G, M)$ of
Section III.5, show that the effect of a homomorphism $\rho : G \to G'$ on
$C^q(G', M)$ is given by $(\rho^* f)(g_1, \ldots, g_q) = f(\rho(g_1), \ldots, \rho(g_q))$.

(b) Verify that $\operatorname{Res} \circ \operatorname{Inf} = 0$ by looking at cocycles.

(c) Show that $\operatorname{Inf}$ is one-one on $H^1(G/H, M^H)$ by showing that any cocycle
$f : G/H \to M^H$ that is a coboundary when viewed as a function on $G$ is
itself a coboundary for $G/H$.

(d) Show that every member of $\ker(\operatorname{Res})$ lies in $\operatorname{image}(\operatorname{Inf})$ by showing that any
cocycle $f : G \to M$ whose restriction to $H$ is a coboundary may be adjusted
to be 0 on $H$ and that an examination of the equation $f(st) = f(s) + s f(t)$
in this case shows $f$ to be a cocycle of $G/H$ with values in $M^H$.




13. Assume inductively that $q > 1$, that $H^k(H, M) = 0$ for $1 \leq k \leq q - 1$, and that
the inflation-restriction sequence is exact for all $N$ for degree $q - 1$ whenever
$H^k(H, N) = 0$ for $1 \leq k < q - 1$. Form $B = I_{\mathbb{Z}}^{\mathbb{Z}G} F_{\mathbb{Z}G}^{\mathbb{Z}} M = \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}G, M)$ as
in Problems 9-11. Elements of $B$ can be identified with functions $\varphi$ on $G$ with
values in $M$, and $G$ acts by $(g_0\varphi)(g) = \varphi(g g_0)$.


(a) For $m \in M$, define $\varphi_m(t) = tm$. Show that $m \mapsto \varphi_m$ is a one-one $\mathbb{Z}G$
homomorphism of $M$ into $B$. If $N = B/M$, then the sequence $0 \to M \to B \to
N \to 0$ is therefore exact in $\mathcal{C}_{\mathbb{Z}G}$.

(b) Use Problem 11 to verify that $H^k(G, B) \cong \operatorname{Ext}^k_{\mathbb{Z}}(\mathbb{Z}, F_{\mathbb{Z}G}^{\mathbb{Z}} M)$, and deduce
that $H^k(G, B) = 0$ for $k \geq 1$.

(c) Verify the equality of right $\mathbb{Z}H$ modules $\mathbb{Z}G = A \otimes_{\mathbb{Z}} \mathbb{Z}H$ for some free
abelian group $A$.

(d) Using (c), show that $F_{\mathbb{Z}G}^{\mathbb{Z}H} B \cong \operatorname{Hom}_{\mathbb{Z}}(\mathbb{Z}H, \operatorname{Hom}_{\mathbb{Z}}(A, M))$, and deduce that
$H^k(H, B) = 0$ for $k \geq 1$.

(e) Using the hypothesis that $H^1(H, M) = 0$ and a long exact sequence
associated to the short exact sequence in (a), show that $0 \to M^H \to B^H \to
N^H \to 0$ is exact.

(f) Prove that $\mathbb{Z} \otimes_{\mathbb{Z}H} \mathbb{Z}G \cong \mathbb{Z}(G/H)$ as right $\mathbb{Z}G$ modules, where $\mathbb{Z}(G/H)$ is
the integral group ring of $G/H$.

(g) Show that $B^H \cong I_{\mathbb{Z}}^{\mathbb{Z}(G/H)} F_{\mathbb{Z}G}^{\mathbb{Z}} M$, and deduce that $H^k(G/H, B^H) = 0$ for $k \geq 1$.

(h) Using the long exact sequences for $G$ and for $H$ associated to the short exact
sequence of (a), as well as the long exact sequence for $G/H$ associated to
the short exact sequence of (e), establish isomorphisms of abelian groups

$$H^{q-1}(G/H, N^H) \cong H^q(G/H, M^H),$$
$$H^{q-1}(G, N) \cong H^q(G, M),$$
$$H^{q-1}(H, N) \cong H^q(H, M).$$

(i) Set up the diagram

$$\begin{array}{ccccccc}
0 & \longrightarrow & H^{q-1}(G/H, N^H) & \longrightarrow & H^{q-1}(G, N) & \longrightarrow & H^{q-1}(H, N) \\
 & & \downarrow & & \downarrow & & \downarrow \\
0 & \longrightarrow & H^{q}(G/H, M^H) & \longrightarrow & H^{q}(G, M) & \longrightarrow & H^{q}(H, M),
\end{array}$$

show that it is commutative, and deduce from the foregoing that the
inflation-restriction sequence is exact for $M$ in degree $q$. (Educational note:
For an application to Brauer groups, let $F \subseteq K \subseteq L$ be fields, and assume
that $K/F$, $L/F$, and $L/K$ are all finite Galois extensions. The groups in
question are $G = \operatorname{Gal}(L/F)$, $H = \operatorname{Gal}(L/K)$, and $G/H = \operatorname{Gal}(K/F)$, and
the modules in question are $M = L^\times$ and $M^H = K^\times$. The index $q$ is to
be 2, and the vanishing of $H^1$ is by Hilbert’s Theorem 90. The conclusion
is that the sequence $0 \to B(K/F) \to B(L/F) \to B(L/K)$ is exact.)
Problems 14-16 introduce the cup product in the cohomology of groups. This is a
construction having applications to topology and algebraic number theory. Let $G$
be a group, and form the standard free resolution $F = (F^+ \xrightarrow{\varepsilon} \mathbb{Z})$ of $\mathbb{Z}$ in the
category $\mathcal{C}_{\mathbb{Z}G}$, as described before Theorem 3.20. A $\mathbb{Z}$ basis of $F_n$ consists of all
tuples $(g_0, \ldots, g_n)$, and a $\mathbb{Z}G$ basis consists of those members of the $\mathbb{Z}$ basis with
$g_0 = 1$. Let $\partial$ denote the boundary operator, with the subscript dropped that indicates
the degree. Define $\varphi_{p,q} : F_{p+q} \to F_p \otimes_{\mathbb{Z}} F_q$ by

$$\varphi_{p,q}(g_0, \ldots, g_{p+q}) = (g_0, \ldots, g_p) \otimes (g_p, \ldots, g_{p+q}).$$
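14. Check that $(\varepsilon \otimes \varepsilon) \circ \varphi_{0,0} = \varepsilon$ and that each $\varphi_{p,q}$ with $p \geq 0$ and $q \geq 0$ is a $\mathbb{Z}G$
homomorphism satisfying

$$\varphi_{p,q} \circ \partial = (\partial \otimes 1) \circ \varphi_{p+1,q} + (-1)^p (1 \otimes \partial) \circ \varphi_{p,q+1}.$$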


15. If $A$ and $B$ are abelian groups on which $G$ acts by automorphisms, show that $G$
acts by automorphisms on $A \otimes_{\mathbb{Z}} B$ in such a way that $g(a \otimes b) = ga \otimes gb$ for
all $a \in A$, $b \in B$, $g \in G$. Thus whenever $A$ and $B$ are unital left $\mathbb{Z}G$ modules,
then so is $A \otimes_{\mathbb{Z}} B$.


16. For any unital left $\mathbb{Z}G$ module $M$, we work with $\operatorname{Hom}_{\mathbb{Z}G}(F_n, M)$ as the space of
$n$-cochains. (Here it is not necessary to unravel the isomorphism given in Section
III.5 that relates $\operatorname{Hom}_{\mathbb{Z}G}(F_n, M)$ to the space $C^n(G, M)$ of cochains defined in
Chapter VII of Basic Algebra.) Define the coboundary operator on the complex
$\operatorname{Hom}_{\mathbb{Z}G}(F^+, M)$ to be $d = \operatorname{Hom}(\partial, 1)$. For any unital left $\mathbb{Z}G$ modules $A$ and
$B$, let $f \in \operatorname{Hom}_{\mathbb{Z}G}(F_p, A)$ and $g \in \operatorname{Hom}_{\mathbb{Z}G}(F_q, B)$ be given. The product cochain
$f \cdot g$ is the member of $\operatorname{Hom}_{\mathbb{Z}G}(F_{p+q}, A \otimes_{\mathbb{Z}} B)$ given by $f \cdot g = (f \otimes g) \circ \varphi_{p,q}$.

(a) Check that $d(f \cdot g) = (df) \cdot g + (-1)^p f \cdot (dg)$.

(b) How does it follow that this product descends to a homomorphism of abelian
groups $a \otimes b \mapsto a \cup b$ carrying the space $H^p(G, A) \otimes_{\mathbb{Z}} H^q(G, B)$ to
$H^{p+q}(G, A \otimes_{\mathbb{Z}} B)$? The descended mapping is called the cup product.

(c) Explain why the cup product is functorial in each variable $A$ and $B$.

(d) Explain why the cup product for $p = 0$ and $q = 0$ may be identified with
the mapping on invariants given by $A^G \otimes_{\mathbb{Z}} B^G \to (A \otimes_{\mathbb{Z}} B)^G$.

Problems 17-20 introduce flat $R$ modules, $R$ being a ring with identity. These modules
are of interest in topology and algebraic geometry. Let $R^o$ be the opposite ring of
$R$; right $R$ modules may be identified with left $R^o$ modules. Let $\mathcal{C}_R$ be the category
of all unital left $R$ modules; tensor product over $R$ can be regarded as a functor in
the second variable, carrying $\mathcal{C}_R$ to $\mathcal{C}_{\mathbb{Z}}$, or as a functor in the first variable, carrying
$\mathcal{C}_{R^o}$ to $\mathcal{C}_{\mathbb{Z}}$. A unital right $R$ module $M$ (i.e., a unital left $R^o$ module) is called flat if
$M \otimes_R (\,\cdot\,)$ is an exact functor from $\mathcal{C}_R$ to $\mathcal{C}_{\mathbb{Z}}$. Since this functor is anyway right exact,
$M$ is flat if and only if tensoring with $M$ carries one-one maps to one-one maps, i.e.,
if and only if whenever $f : A \to B$ is one-one, then $1_M \otimes f : M \otimes_R A \to M \otimes_R B$
is one-one. Take as known the analog for the functor Tor of all the facts about Ext
proved in Section 7.
17. Prove for unital right $R$ modules that
(a) the right $R$ module $R$ is flat,
(b) a direct sum $F = \bigoplus_{s \in S} F_s$ is flat if and only if each $F_s$ is flat,
(c) any projective in $\mathcal{C}_{R^o}$ is flat.


18. Let $M$ be a unital right $R$ module. For each finite subset $F$ of $M$, let $M_F$ be the
right $R$ submodule of $M$ generated by the members of $F$. Prove that $M$ is flat if
and only if each $M_F$ is flat.


19. Let $B$ be in $\mathcal{C}_R$, write $B$ as the $R$ homomorphic image of a free left $R$ module $F$,
and form the exact sequence $0 \to K \to F \to B \to 0$ in which $K$ is the kernel
of $F \to B$. Prove for each unital right $R$ module $A$ that the sequence

$$0 \to \operatorname{Tor}_1^R(A, B) \to A \otimes_R K \to A \otimes_R F \to A \otimes_R B \to 0$$

is exact. Deduce that $A$ is flat if and only if $\operatorname{Tor}_1^R(A, B) = 0$ for all $B$.
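
As a small numerical illustration of the exact sequence in Problem 19, the following sketch takes $R = \mathbb{Z}$, $A = \mathbb{Z}/b\mathbb{Z}$, and $B = \mathbb{Z}/a\mathbb{Z}$ with $a, b > 0$, and uses the free presentation $0 \to \mathbb{Z} \xrightarrow{a} \mathbb{Z} \to \mathbb{Z}/a\mathbb{Z} \to 0$. These choices, and the function name `tor1_order`, are assumptions made for illustration and are not part of the text. Tensoring with $A$ turns $K \to F$ into multiplication by $a$ on $\mathbb{Z}/b\mathbb{Z}$, so the Tor term appears as the kernel of that map.

```python
from math import gcd


def tor1_order(a: int, b: int) -> int:
    """Order of Tor_1^Z(Z/b, Z/a), found as the kernel of (times a) on Z/b.

    Assumes a, b > 0 and uses the presentation 0 -> Z --a--> Z -> Z/a -> 0.
    """
    kernel = [x for x in range(b) if (a * x) % b == 0]
    return len(kernel)


for a, b in [(2, 2), (4, 6), (3, 5)]:
    # The brute-force kernel size is compared with gcd(a, b).
    print(a, b, tor1_order(a, b), gcd(a, b))
```

In each sampled case the kernel has $\gcd(a, b)$ elements, so the Tor term vanishes exactly when $a$ and $b$ are relatively prime.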


20. Suppose that $R$ is a (commutative) principal ideal domain, so that in particular
$R = R^o$. The torsion submodule $T(M)$ of a module $M$ in $\mathcal{C}_R$ consists of all
$m \in M$ with $rm = 0$ for some $r \neq 0$ in $R$.

(a) Suppose that $M$ is of the form $M = F \oplus T(M)$, where $F$ is a free $R$
module. Using the exact sequence $0 \to F \to M \to T(M) \to 0$, prove
that $\operatorname{Tor}_1^R(M, B) \cong \operatorname{Tor}_1^R(T(M), B)$ for all modules $B$ in $\mathcal{C}_R$.

(b) Deduce from (a) and Problem 18 that a module $M$ in $\mathcal{C}_R$ is flat if and only if
$T(M)$ is flat. (Note that $M$ is not assumed to be of the form $F \oplus T(M)$.)

(c) By comparing the one-one inclusion $(a) \subseteq R$ for a nonzero $a \in R$ with the
induced map from $(a) \otimes_R M$ to $R \otimes_R M$, prove that $T(M) \neq 0$ implies $M$
not flat.

(d) Deduce that a module $M$ in $\mathcal{C}_R$ is flat if and only if $M$ has 0 torsion, i.e., if
and only if $M$ is torsion free. (Educational note: In combination with the
result of Problem 19, this condition explains the use of the notation “Tor”
for the first derived functor of tensor product.)


Problems 21-25 deal with double chain complexes of abelian groups. A double
chain complex is a system $\{E_{p,q}\}$ of abelian groups defined for all integers $p$ and $q$
and having boundary homomorphisms $\partial'_{p,q} : E_{p,q} \to E_{p-1,q}$ and $\partial''_{p,q} : E_{p,q} \to E_{p,q-1}$
such that $\partial'_{p-1,q}\partial'_{p,q} = 0$, $\partial''_{p,q-1}\partial''_{p,q} = 0$, and $\partial'_{p,q-1}\partial''_{p,q} + \partial''_{p-1,q}\partial'_{p,q} = 0$. This set
of problems will assume that $E_{p,q} = 0$ if either $p$ or $q$ is sufficiently negative.

21. Let $\{E_{p,q}\}$ be a double complex of abelian groups with boundary homomorphisms
as above, let $E_n = \bigoplus_{p+q=n} E_{p,q}$, and define $\partial_n : E_n \to E_{n-1}$ by $\partial_n\big|_{E_{p,q}} =
\partial'_{p,q} + \partial''_{p,q}$. Show that the maps $\partial_n$ make the system $\{E_n\}$ into a chain complex.
(Note: The indexing on the boundary maps has been changed by 1 from earlier in
the chapter in order to simplify the notation that occurs later in these problems.)
22. Let $\mathcal{C}_1$ be a good category of unital left $R$ modules, and let $\mathcal{C}_2$ be a good category
of unital left $R^o$ modules; the latter modules are to be regarded as unital right
$R$ modules. Let $C = \{C_p\}_{p > -\infty}$ and $D = \{D_q\}_{q > -\infty}$ be chain complexes with
boundary maps $\alpha_p : C_p \to C_{p-1}$ in $\mathcal{C}_2$ and $\beta_q : D_q \to D_{q-1}$ in $\mathcal{C}_1$. It is assumed
that $C_p = 0$ for $p$ sufficiently negative and that $D_q = 0$ for $q$ sufficiently
negative. Define $E_{p,q} = C_p \otimes_R D_q$, $\partial'_{p,q} = \alpha_p \otimes 1$, and $\partial''_{p,q} = (-1)^p (1 \otimes \beta_q)$.
Prove that $\{E_{p,q}\}$ with these mappings is a double complex of abelian groups.
(Educational note: Therefore the previous problem creates a chain complex
$\{E_n\}$ with boundary maps $\partial_n : E_n \to E_{n-1}$ from this set of data. One writes
$E = C \otimes_R D$ for this chain complex and calls it the tensor product of the two
chain complexes.)


23. In the notation of the previous problem, suppose that $C_p = 0$ if $p < 0$ and
$D_q = 0$ if $q < 0$. Let $Z_p = \ker \alpha_p$ and $Z'_q = \ker \beta_q$. Prove that if $c$ is in $Z_p$
and $d$ is in $Z'_q$, then $c \otimes d$ is in the subgroup $\ker(\partial'_{p,q} + \partial''_{p,q})$ of $E_{p,q}$ and that as
a consequence, there is a canonical homomorphism of $H_p(C) \otimes_R H_q(D)$ into
$H_{p+q}(C \otimes_R D)$.

24. Suppose that a double complex $E_{p,q}$ of abelian groups has $E_{p,q} = 0$ if $p < -1$ or
$q < -1$ or $p = q = -1$. Suppose further that $E_{\cdot,q}$ is exact for each $q \geq 0$ and
$E_{p,\cdot}$ is exact for each $p \geq 0$. Prove that the $r$th homology of $E_{-1,q}$ as $q$ varies
matches the $r$th homology of $E_{p,-1}$ as $p$ varies. To do so, start from a cycle $a$
under $\partial''$ in $E_{-1,k}$ with $k \geq 0$. It is mapped to 0 by $\partial'$, hence has a preimage $a'$
under $\partial'$ in $E_{0,k}$. The element $\partial'' a'$ in $E_{0,k-1}$ is mapped to 0 by $\partial'$, hence has a
preimage $a''$ in $E_{1,k-1}$. Continue in this way, and arrive at a cycle in $E_{k,-1}$. Then
sort out the details.


25. With notation as in Problem 22, let $A$ be in $\mathcal{C}_2$, and let $B$ be in $\mathcal{C}_1$. Let $C =
(C^+ \to A)$ be a projective resolution of $A$, and let $D = (D^+ \to B)$ be a
projective resolution of $B$. Form $E = C \otimes_R D$ as in Problem 22, and apply
Problem 24 to give a direct proof (without the machinery of Section 7) that one
gets the same result for $\operatorname{Tor}^R_n(A, B)$ by using a projective resolution in the first
variable as by using a projective resolution in the second variable.


Problems 26-31 concern the Künneth Theorem for homology and the Universal
Coefficient Theorem for homology. Both these results have applications to topology.
It will be assumed throughout that $R$ is a (commutative) principal ideal domain.


STATEMENT OF KÜNNETH THEOREM. Let $C$ and $D$ be chain complexes
over the principal ideal domain $R$, and assume that all modules in negative
degrees are 0 and that $C$ is flat. Then there is a natural short exact sequence

$$0 \to \bigoplus_{p+q=n} \big(H_p(C) \otimes_R H_q(D)\big) \xrightarrow{\alpha_n} H_n(C \otimes_R D) \xrightarrow{\beta_{n-1}} \bigoplus_{p+q=n-1} \operatorname{Tor}_1^R(H_p(C), H_q(D)) \to 0.$$

Moreover, the exact sequence splits, but not naturally.
The point of the theorem is to give circumstances under which the homology of
each of two chain complexes $C$ and $D$ determines the homology of the tensor product
$E = C \otimes_R D$, the tensor product complex being defined as in Problem 22. Problem 26
below shows that some further hypothesis is needed beyond the limitation on $R$. A
sufficient condition is that one of $C$ and $D$, say $C$, be flat in the sense that all
the modules in it satisfy the condition of flatness defined in Problems 17-20. The
problems in the set carry out some of the steps in proving the Künneth Theorem, and
then they derive the Universal Coefficient Theorem for homology as a consequence.
To keep the ideas in focus, the problems will suppress certain isomorphisms, writing
them as equalities.


26. With $R = \mathbb{Z}$, let $C = D$ be the chain complex with $C_0 = \mathbb{Z}/2\mathbb{Z}$ and with $C_p = 0$
for $p \neq 0$. Let $C'$ be the chain complex with $C'_0 = \mathbb{Z}$, with $C'_1 = \mathbb{Z}$, and with
$C'_p = 0$ for $p > 1$ and for $p < 0$. Let the boundary map from $C'_1$ to $C'_0$ be
$\times 2$. Compute the homology of $C$, $C'$, $D$, $C \otimes_{\mathbb{Z}} D$, and $C' \otimes_{\mathbb{Z}} D$, and justify the
conclusion that the homology of each of two chain complexes does not determine
the homology of their tensor product.
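
The computation requested in Problem 26 can be sanity-checked by brute force. The sketch below assumes the standard identification of $\mathbb{Z}/a\mathbb{Z} \otimes_{\mathbb{Z}} \mathbb{Z}/b\mathbb{Z}$ with $\mathbb{Z}/\gcd(a, b)\mathbb{Z}$, where the value 0 stands for $\mathbb{Z}$ itself; the helper names `tensor_modulus` and `two_term_homology` are hypothetical and serve only this illustration.

```python
from math import gcd


def tensor_modulus(a: int, b: int) -> int:
    """Modulus k with Z/a (x) Z/b = Z/k; the value 0 stands for Z, so gcd(0, b) = b."""
    return gcd(a, b)


def two_term_homology(k: int, d: int) -> tuple[int, int]:
    """Orders of (H_1, H_0) for the complex Z/k --(times d)--> Z/k, assuming k > 0."""
    kernel = [x for x in range(k) if (d * x) % k == 0]
    image = {(d * x) % k for x in range(k)}
    return len(kernel), k // len(image)


# C (x) D is Z/2 (x) Z/2 in degree 0 and zero elsewhere.
print("order of H_0(C (x) D):", tensor_modulus(2, 2))

# C' (x) D is Z/2 --(times 2)--> Z/2 in degrees 1 and 0.
k = tensor_modulus(0, 2)
print("orders of H_1, H_0 of C' (x) D:", two_term_homology(k, 2))
```

By hand, $H_0(C') = \mathbb{Z}/2\mathbb{Z}$ and $H_1(C') = 0$, so $C$ and $C'$ have the same homology, while the sketch reports a nonzero $H_1$ only for $C' \otimes_{\mathbb{Z}} D$.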


27. Let $\partial'$ be the boundary map for $C$. Show how to set up an exact sequence

$$0 \to Z \xrightarrow{\iota} C \xrightarrow{\partial'} B' \to 0$$

of complexes in which each module in $Z$ is the submodule of cycles of the
corresponding module in $C$, $\iota$ is the inclusion, $B$ is the complex of boundaries,
and $B'$ is $B$ with its indices shifted by 1. Why does it follow from the fact that
$C$ is flat that $Z$, $B$, and $B'$ are flat?


28. Explain why

$$0 \to Z \otimes_R D \xrightarrow{\iota \otimes 1} C \otimes_R D \xrightarrow{\partial' \otimes 1} B' \otimes_R D \to 0$$

is exact even though $D$ is not assumed to be flat.


29. The long exact sequence in homology corresponding to the short exact sequence
in the previous problem has segments of the form

$$H_{n+1}(B' \otimes_R D) \xrightarrow{\omega_n} H_n(Z \otimes_R D) \xrightarrow{\iota_n \otimes 1} H_n(C \otimes_R D)
\xrightarrow{\partial'_n \otimes 1} H_n(B' \otimes_R D) \xrightarrow{\omega_{n-1}} H_{n-1}(Z \otimes_R D).$$

Let $\partial''$ be the boundary map for $D$, and let $\bar{Z}$, $\bar{B}$, and $\bar{B}'$ be the counterparts for
$D$ of the complexes $Z$, $B$, and $B'$ for $C$. Show that

(a) the boundary map in $B' \otimes_R D$ may be regarded as $1 \otimes \partial''$ because the
boundary map in $B'$ is 0.

(b) $\ker(1 \otimes \partial'')_n = (B' \otimes_R \bar{Z})_n$ and $\operatorname{image}(1 \otimes \partial'')_{n+1} = (B' \otimes_R \bar{B})_n$ because
$B'$ is flat.

(c) $H_n(B' \otimes_R D) = (B \otimes_R H(D))_{n-1}$ because $B'$ is flat. (This isomorphism
will be treated as an equality below.)

(d) similarly $H_n(Z \otimes_R D) = (Z \otimes_R H(D))_n$. (This isomorphism will be treated
as an equality below.)

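30. Form an exact sequence

$$0 \to B \to Z \to H(C) \to 0$$

of complexes, form the low-degree part of the long exact sequence corresponding
to applying the functor $(\,\cdot\,) \otimes_R H(D)$, namely

$$0 \to \operatorname{Tor}_1^R(H(C), H(D))_n \to (B \otimes_R H(D))_n
\to (Z \otimes_R H(D))_n \to (H(C) \otimes_R H(D))_n \to 0,$$

and rewrite it by (c) and (d) of Problem 29 as

$$0 \to \operatorname{Tor}_1^R(H(C), H(D))_n \to H_{n+1}(B' \otimes_R D)
\xrightarrow{\omega_n} H_n(Z \otimes_R D) \to (H(C) \otimes_R H(D))_n \to 0.$$

(a) Why is the term $\operatorname{Tor}_1^R(Z, H(D))$ in the long exact sequence equal to 0?

(b) In the 5-term exact sequence of Problem 29, rewrite the part of the sequence
centered at the map $\partial'_n \otimes 1$ in such a way that two exact sequences

$$H_n(Z \otimes_R D) \xrightarrow{\iota_n \otimes 1} H_n(C \otimes_R D) \xrightarrow{q} \operatorname{coker}(\iota_n \otimes 1) \to 0$$

and

$$0 \to \ker \omega_{n-1} \xrightarrow{i} H_n(B' \otimes_R D) \xrightarrow{\omega_{n-1}} H_{n-1}(Z \otimes_R D)$$

result. Why can the group $\ker \omega_{n-1}$ and the homomorphism $i$ be taken to be
$\operatorname{Tor}_1^R(H(C), H(D))_{n-1}$ and $\beta'_{n-1}$?

(c) Why in (b) can $\operatorname{coker}(\iota_n \otimes 1)$ and $q$ be taken to be $\operatorname{Tor}_1^R(H(C), H(D))_{n-1}$
and some onto homomorphism $\beta_{n-1}$ such that $\beta'_{n-1}\beta_{n-1} = \partial'_n \otimes 1$?

(d) Arguing similarly with the map $\iota_n \otimes 1$ in Problem 29, obtain a factorization
$\iota_n \otimes 1 = \alpha_n \alpha'_n$ in which $\alpha'_n : (Z \otimes_R H(D))_n \to (H(C) \otimes_R H(D))_n$ is onto
and $\alpha_n : (H(C) \otimes_R H(D))_n \to H_n(C \otimes_R D)$ is one-one.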


(e) The maps $\alpha_n$ and $\beta_{n-1}$ having now been defined in the sequence in the
statement of the Künneth Theorem, prove that the sequence is exact.


31. (Universal Coefficient Theorem) By specializing $D$ in the statement of the
Künneth Formula to a chain complex that is a module $M$ in dimension 0 and is 0
in all other dimensions, obtain the natural short exact sequence

$$0 \to H_n(C) \otimes_R M \to H_n(C \otimes_R M) \to \operatorname{Tor}_1^R(H_{n-1}(C), M) \to 0,$$

valid whenever $R$ is a principal ideal domain and $C$ is a chain complex whose
modules are all 0 in dimension $< 0$. (Educational note: The exact sequence
splits, but not naturally.)


Problems 32-35 concern abelian categories.


32. Let $\mathcal{C}$ be an abelian category. Let $\mathcal{D}$ be the category for which $\operatorname{Obj}(\mathcal{D})$ consists of
all chain complexes of objects and morphisms in $\mathcal{C}$ and for which $\operatorname{Morph}(X, Y)$
for any two objects $X$ and $Y$ in $\mathcal{D}$ consists of all chain maps from $X$ to $Y$. Prove
that $\mathcal{D}$ is an abelian category.


33. Consider the snake diagram in the category of all abelian groups consisting of the
four rightmost groups in the first row and the four leftmost groups in the second
row of the following commutative diagram:


0 S57 Sa tg AEE IG aa ok i)
lmod8
[x [2 t>2mod4

0 9 Sg SE AG


Adjoin the 0’s to make the diagram become what is displayed. Following the
steps in the example of a diagram chase in Section 8, extend this diagram to the
auxiliary diagram that appears in that discussion, and show that $(B_0, k)$ for the
extended diagram is not a kernel of $\beta$.


34. For a general abelian category $\mathcal{C}$ and any $M$ in $\operatorname{Obj}(\mathcal{C})$, verify that $\operatorname{Hom}(\,\cdot\,, M)$
is a left exact contravariant functor from $\mathcal{C}$ to $\mathcal{C}_{\mathbb{Z}}$ and $\operatorname{Hom}(M, \,\cdot\,)$ is a left exact
covariant functor from $\mathcal{C}$ to $\mathcal{C}_{\mathbb{Z}}$.


35. Proposition 4.19 shows for any good category $\mathcal{C}$ of unital left $R$ modules that a
module $P$ in $\mathcal{C}$ is projective for $\mathcal{C}$ if and only if $\operatorname{Hom}(P, \,\cdot\,)$ is an exact functor,
if and only if every short exact sequence $0 \to X \to Y \to P \to 0$ splits.
Rewrite this proof in such a way that it applies to arbitrary abelian categories
$\mathcal{C}$. For the step in the argument that the splitting of every short exact sequence
$0 \to X \to Y \to P \to 0$ implies that $P$ is projective, use the notion of pullback
that is developed in Section 8.


CHAPTER V 


Three Theorems in Algebraic Number Theory 


Abstract. This chapter establishes some essential foundational results in the subject of algebraic 
number theory beyond what was already in Basic Algebra. 

Section 1 puts matters in perspective by examining what was proved in Chapter I for quadratic
number fields and picking out questions that need to be addressed before one can hope to develop a 
comparable theory for number fields of degree greater than 2. 

Sections 2-4 concern the field discriminant of a number field. Section 2 contains the definition of 
discriminant, as well as some formulas and examples. The main result of Section 3 is the Dedekind 
Discriminant Theorem. This concerns how prime ideals (p) in Z split when extended to the ideal 
(p)R in the ring of integers R of a number field. The theorem says that ramification, i.e., the
occurrence of some prime ideal factor in R to a power greater than 1, occurs if and only if p divides 
the field discriminant. The theorem is proved only in a very useful special case, the general case 
being deferred to Chapter VI. The useful special case is obtained as a consequence of Kummer’s 
criterion, which relates the factorization modulo p of irreducible monic polynomials in Z[X] to the 
question of the splitting of the ideal (p)R. Section 4 gives a number of examples of the theory for 
number fields of degree 3. 

Section 5 establishes the Dirichlet Unit Theorem, which describes the group of units in the ring 
of algebraic integers in a number field. The torsion subgroup is the subgroup of roots of unity, and 
it is finite. The quotient of the group of units by the torsion subgroup is a free abelian group of a 
certain finite rank. The proof is an application of the Minkowski Lattice-Point Theorem. 

Section 6 concerns class numbers of algebraic number fields. Two nonzero ideals I and J in the
ring of algebraic integers of a number field are equivalent if there are nonzero principal ideals (a)
and (b) with (a)I = (b)J. It is relatively easy to prove that the set of equivalence classes has a group
structure and that the order of this group, which is called the class number, is finite. The class number 
is 1 if and only if the ring is a principal ideal domain. One wants to be able to compute class numbers, 
and this easy proof of finiteness of class numbers is not helpful toward this end. Instead, one applies 
the Minkowski Lattice-Point Theorem a second time, obtaining a second proof of finiteness, one that 
has a sharp estimate for a finite set of ideals that need to be tested for equivalence. Some examples 
are provided. A by-product of the sharp estimate is Minkowski’s theorem that the field discriminant 
of any number field other than Q is greater than 1. In combination with the Dedekind Discriminant 
Theorem, this result shows that there always exist ramified primes over Q. 


1. Setting 


It is worth stepping back from the results of Chapter I to put matters into perspec- 
tive. Chapter I studied three problems, all of which could be stated in terms of 
elementary number theory. These were the questions of solvability of quadratic
congruences, of representability of integers or rational numbers by primitive
binary quadratic forms, and of the infinitude of primes in arithmetic progressions.

We had started from the more general problem of studying Diophantine equations,
beginning with the observation that solvability in integers implies solvability
modulo each prime.¹ Linear congruences being no problem, we began with
quadratic congruences and were led to quadratic reciprocity. Then we sought 
to apply quadratic reciprocity to address representability of integers or rational 
numbers by binary quadratic forms. The reasons for studying the infinitude of 
primes in arithmetic progressions were more subtle; what we saw was that at 
various stages in dealing with binary quadratic forms, this question of infinitude 
kept arising, along with techniques that might be helpful in addressing it. 

Work on at least the first two of the problems was helped to some extent by the 
use of algebraic integers, and we shall see momentarily that algebraic integers 
illuminate work on the third problem as well. In any event, it is apparent where 
to look for a natural generalization. We are to study higher-degree congruences, 
perhaps in more than one variable, and we are to use algebraic extensions of the 
rationals of degree greater than 2 to help in the study. 

The situation studied in Section IX.17 of Basic Algebra will be general enough 
for now. Thus let F(X) be a monic irreducible polynomial in Z[X]. Section IX.17
began to look at the question of how F(X) reduces modulo each prime p. We 
begin by reviewing the case of degree 2, the main results in this case having been 
obtained in Chapter I in the present volume. For the polynomial $F(X) = X^2 - m$
with $m \in \mathbb{Z}$, the assumed irreducibility means that m is not the square of an
integer. For fixed m and most primes p, either F(X) remains irreducible modulo 
p or F(X) splits as the product of two distinct linear factors. The exceptional 
primes have the property that F(X) modulo p is the square of a linear factor; 
these are the prime divisors of m and sometimes the prime 2. In short, they occur 
among the prime divisors of the discriminant 4m of F(X). In terms of quadratic 
residues, the irreducibility of F(X) modulo p means that m is not a quadratic 
residue modulo p, and the splitting into two distinct linear factors means that it 
is. The odd primes for which F(X) modulo p is the square of a linear factor are 
the odd primes that divide m. Modulo 2, every integer is a square, and reduction 
modulo 2 was not helpful. 
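
The three factorization patterns just described can be observed directly by counting roots. The following is a minimal sketch, with $m = 5$ chosen arbitrarily and with hypothetical function names; it classifies $X^2 - m$ modulo $p$ by brute force and compares the result with Euler's criterion, a standard test for quadratic residues that the text does not itself invoke here.

```python
def factor_type(m: int, p: int) -> str:
    """Factorization type of X^2 - m modulo the prime p, found by counting roots."""
    roots = [x for x in range(p) if (x * x - m) % p == 0]
    if not roots:
        return "irreducible"
    if len(roots) == 2:
        return "two distinct linear factors"
    return "square of a linear factor"


def is_quadratic_residue(m: int, p: int) -> bool:
    """Euler's criterion; assumes p is an odd prime not dividing m."""
    return pow(m % p, (p - 1) // 2, p) == 1


m = 5
for p in [2, 3, 5, 7, 11, 19]:
    print(p, factor_type(m, p))
```

For odd primes $p$ not dividing $m$, the brute-force classification agrees with the residue test, and the primes 2 and 5 land in the exceptional case, consistent with their dividing the discriminant $4m = 20$.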

The number theory of quadratic number fields sheds additional light on this
factorization. The relevant field is of course $\mathbb{Q}(\sqrt{m})$; this is a nontrivial extension
of $\mathbb{Q}$, since m is not square. In working with this field in Chapter I, we imposed
the additional condition that m be square free. Promising a general definition for
later, we defined the field discriminant of $\mathbb{Q}(\sqrt{m})$ in that chapter to be

$$D = \begin{cases} 4m & \text{if } m \equiv 2 \bmod 4 \text{ or } m \equiv 3 \bmod 4, \\ m & \text{if } m \equiv 1 \bmod 4. \end{cases}$$


¹ Solvability modulo each prime power is also of interest but played a role in Chapter I only for
powers of 2.


Problems 20-24 in Chapter I implicitly related the splitting of F(X) modulo p 
to the factorization of ideals. Let R be the ring of algebraic integers in $\mathbb{Q}(\sqrt{m})$.
If p is an odd prime, those problems observed that (p)R is a prime ideal in R if 
D is a nonsquare modulo p, is the product of two distinct prime ideals if D is a 
square modulo p but is not divisible by p, and is the square of a prime ideal if D 
is divisible by p. The factorization of (2)R was more subtle and was addressed 
in Problem 21. 

In any event, the pattern of reducibility modulo p of $X^2 - m$, at least when
the prime p is odd, mirrors the pattern of factorization of the ideal generated
by p in the ring of algebraic integers in the number field $\mathbb{Q}(\sqrt{m})$. The role
of quadratic reciprocity was to explain this pattern. Problem 1 at the end of
Chapter I showed that one qualitative consequence of quadratic reciprocity is that
the odd primes p for which $X^2 - m$ remains irreducible are the ones in certain
arithmetic progressions, and similarly for the odd primes not dividing m for which
a factorization into two linear factors occurs.

One objective of a generalization is to produce a corresponding theory for an 
arbitrary monic irreducible polynomial F(X) in Z[X], say of degree n. Let K be 
the extension of Q generated by a root of F(X), and let R be the ring of algebraic 
integers in K. Theorem 9.60 of Basic Algebra shows for each prime number p 
that the decomposition of the ideal (p)R in R as a product of powers of distinct 
prime ideals takes the form $(p)R = \prod_{i=1}^{g} P_i^{e_i}$ with $f_i = [R/P_i : \mathbb{Z}/(p)]$ and
$\sum_{i=1}^{g} e_i f_i = n$. Meanwhile, F(X) factors modulo p as a product of powers of
irreducible polynomials modulo p. Sections 2—3 will describe a theory begun 
by Kummer and Dedekind for how the factorization of the ideal (p)R and the 
factorization of the polynomial F(X) modulo p are related. One introduces a field 
discriminant for K that is closely related to the discriminant of the polynomial 
F(X), and a key result, the Dedekind Discriminant Theorem, says that some $e_i$
is > 1 if and only if p divides the field discriminant. The primes p for which
some $e_i$ is greater than 1 are said to ramify in the extension field K. These primes
are not as well behaved as the others, and one’s first inclination might be to try 
to ignore them. However, Problems 25—40 at the end of Chapter I show that the 
ramified primes encode a great deal of information; in particular, they explain the 
theory of genera and the relationship between exact representability of rational 
numbers and representability of integers modulo the field discriminant. 

Generalizations of quadratic reciprocity lie much deeper and are central results 
of the subject of class field theory, a subject that is beyond the scope of the present 
book. Suffice it to say that class field theory in its established form seeks to
parametrize all finite Galois extensions of any number field having abelian Galois
group; the parametrization is to refer only to data within the given number field. 
The reciprocity theorem in this setting goes under the name “Artin reciprocity,” 
which includes quadratic reciprocity as a very special case. Class field theory 
for nonabelian finite Galois extensions is at present largely conjectural, and the 
conjectural reciprocity statement goes under the name “Langlands reciprocity.” 

Beginning in Section I.6, we translated some of the theory of binary quadratic 
forms into facts about quadratic number fields. One tool we needed was a de- 
scription of the units in the ring of algebraic integers within the quadratic number 
field. It is to be expected that a similar description for an arbitrary number field 
will play a foundational role in number theory beyond the quadratic case. The 
description in question is captured in the Dirichlet Unit Theorem, which appears 
as Theorem 5.13 in Section 5. 

The translation of the notion of proper equivalence class of binary quadratic 
forms into the language of quadratic field extensions led to a notion of strict 
equivalence of ideals, as well as a notion of ordinary equivalence. Because there 
are only finitely many proper equivalence classes of forms, there could be only 
finitely many strict equivalence classes of ideals, and this set of classes of ideals 
acquired the structure of a finite abelian group. Dirichlet studied the order of this 
group, which figures into formulas for the value of certain Dirichlet L functions
$L(s, \chi)$ at s = 1. The ideal class group for ordinary equivalence is a quotient of
this group by a subgroup of order at most 2. 

Although we shall not be concerned with representability of integers by forms 
of degree greater than 2, the ideal class group and its order (the “class number” 
of the field) are of interest for general number fields when defined in terms of 
ordinary equivalence, not strict equivalence. Section 6 is devoted to proving that 
the class number is finite for any number field and to developing some tools 
for computing class numbers. Class number 1 is equivalent to having the ring 
of algebraic integers in question be a principal ideal domain. Apart from the 
appearance of class numbers in various limit formulas, here is one other indicator 
of the importance of the ideal class group: It is possible to extend the above theory 
of ramification in such a way that it applies to any extension K/F of number fields, 
not just to finite extensions of Q. Hilbert proved that for any F, there is a finite
Galois extension K/F with abelian Galois group that is small enough for the 
extension to be unramified at every prime ideal of F and that is large enough for 
any unramified abelian extension of F to lie in K. Artin reciprocity can be used 
to show that Gal(K/F) is isomorphic to the ideal class group² of F and thus gives
some control over the nature of K. In particular, K = F if and only if every
ideal in the ring of integers of F is principal. When F is quadratic over Q, the
field K can be used to give more definitive results than in Chapter I concerning
representability of integers by binary quadratic forms.

² The field K is called the Hilbert class field of F. The name “class field” is meant to be a
reminder of this isomorphism.


2. Discriminant 


Let us recall some material about Dedekind domains from Chapters VIII and IX 
of Basic Algebra. A Dedekind domain is a Noetherian integral domain that is 
integrally closed and has the property that every nonzero prime ideal is maximal. 
Any principal ideal domain is an example. Any Dedekind domain has unique 
factorization for its ideals. Theorem 8.54 of the book gave a construction for 
extending certain Dedekind domains to larger Dedekind domains: if D is a 
Dedekind domain with field of fractions F and if K is a finite separable extension 
of F, then the integral closure of D in K is a Dedekind domain R. The hard 
step in the proof, which was not carried out until Section IX.15, was to deduce 
from the separability that R is finitely generated over D. The role of separability 
was to force the bilinear form $(a, b) \mapsto \mathrm{Tr}_{K/F}(ab)$ to be nondegenerate, and this
nondegeneracy in turn implied the desired result about finite generation. 

In this section we introduce a tool that captures this last implication in quan- 
titative fashion—that nondegeneracy of the trace form implies that the extended 
domain is finitely generated over the given domain. In a full-fledged treatment of 
algebraic number theory, one might well want to work in this full generality,³ but
we need less for our purposes: Throughout this section we assume that the given
Dedekind domain is the ring Z of integers, that K is a number field, and that R is
the integral closure of Z in K, i.e., R is the ring of algebraic integers within K.
Let $n = [K : \mathbb{Q}]$ be the degree of the field extension. Since C is algebraically
closed, we can regard K as a subfield of C. 

The separability of K/Q in combination with the fact that C is algebraically
closed implies that there exist exactly n distinct field maps $\sigma_1, \dots, \sigma_n$ of K into
C; one of them is the identity. Recall how $\sigma_1, \dots, \sigma_n$ can be constructed: if $\xi$ is a
primitive element for K/Q, if F(X) is the minimal polynomial of $\xi$ over Q, and
if $\xi = \xi_1, \xi_2, \dots, \xi_n$ are the n distinct roots of F(X) in C, then $\sigma_j$ can be defined
by $\sigma_j\big(\sum_{i=0}^{n-1} c_i \xi^i\big) = \sum_{i=0}^{n-1} c_i \xi_j^{\,i}$ on any Q linear combination of powers of $\xi$. For
any $\eta = \sum_{i=0}^{n-1} c_i \xi^i$ in K, primitive or not, the n elements $\sigma_j(\eta)$ of C are called
the conjugates of $\eta$ relative to K. They are the roots of the field polynomial of $\eta$
over K, and each occurs with multiplicity $[K : \mathbb{Q}(\eta)]$.⁴

³ For example, this full level of generality would be appropriate if one planned ultimately to study
class field theory.

⁴ The field polynomial of an element of K is the characteristic polynomial of left multiplication
on K by the element. This notion is discussed in Section IX.15 of Basic Algebra.

Let $\Gamma = (v_1, \dots, v_n)$ be an ordered basis of K over Q. The symmetric
bilinear form $(u, v) \mapsto \mathrm{Tr}_{K/\mathbb{Q}}(uv)$ determines an n-by-n symmetric matrix $B_{ij} =
\mathrm{Tr}_{K/\mathbb{Q}}(v_i v_j)$, and we can recover the form from the matrix B by the formula
$\mathrm{Tr}_{K/\mathbb{Q}}(uv) = a^t B b$ if $a = (a_1, \dots, a_n)^t$ and $b = (b_1, \dots, b_n)^t$ are the column vectors of u and v in
the ordered basis $\Gamma$, i.e., if $u = \sum_{i=1}^{n} a_i v_i$ and $v = \sum_{j=1}^{n} b_j v_j$. From Section VI.1
of Basic Algebra, we know that the bilinear form determines a canonical Q linear
map L from K to its vector space dual by the formula $L(u)(v) = \mathrm{Tr}_{K/\mathbb{Q}}(uv)$ and
that the nondegeneracy of the form⁵ implies that this linear map is one-one onto.
Moreover, the matrix of L with respect to $\Gamma$ and the dual basis of $\Gamma$ is B. Thus
the nondegeneracy implies that the matrix B is nonsingular. The discriminant
$D(\Gamma)$ of the ordered basis $\Gamma$ is given by

$$D(\Gamma) = \det B, \quad \text{where } B \text{ is the matrix of } (u, v) \mapsto \mathrm{Tr}_{K/\mathbb{Q}}(uv) \text{ in the basis } \Gamma.$$

Because of the nonsingularity of B, this is a nonzero member of Q.

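Proposition 6.1 of Basic Algebra shows the effect on the matrix B of changing
the basis. Specifically let $\Delta = (w_1, \dots, w_n)$ be a second ordered basis, and let C
be the matrix of the form in this basis, namely $C_{ij} = \mathrm{Tr}_{K/\mathbb{Q}}(w_i w_j)$. Let the two
bases be related by $w_j = \sum_{i=1}^{n} a_{ij} v_i$, i.e., let $[a_{ij}] = \begin{pmatrix} I \\ \Gamma\Delta \end{pmatrix}$. Then the proposition
gives

$$C = \begin{pmatrix} I \\ \Gamma\Delta \end{pmatrix}^{t} B \begin{pmatrix} I \\ \Gamma\Delta \end{pmatrix}.$$

Taking determinants and using the fact that a matrix and its transpose have the
same determinant, we obtain

$$D(\Delta) = D(\Gamma) \left( \det \begin{pmatrix} I \\ \Gamma\Delta \end{pmatrix} \right)^{2}.$$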


One consequence of this formula is that the sign of $D(\Gamma)$ is independent of $\Gamma$.
Another is that the value of $D(\Gamma)$ does not depend on the ordering of the n
members of $\Gamma$; it depends only on $\Gamma$ as an unordered set.

Now suppose that the members of the ordered basis $\Gamma$ are in the subring R
of algebraic integers within K. Bases of K over Q consisting of members of R
always exist, since we can always multiply the members of a basis of K over Q by
a suitable integer to get them to be in R. In this case the entries $B_{ij} = \mathrm{Tr}_{K/\mathbb{Q}}(v_i v_j)$
of the matrix of the bilinear form are in Z, and $D(\Gamma)$ is therefore a nonzero member
of Z.

The field discriminant, or absolute discriminant, of K, denoted by $D_K$, is
the value of $D(\Gamma)$ that minimizes $|D(\Gamma)|$ for all bases of K consisting of members
of R. This is a nonzero integer. The sign of $D_K$ is well defined, since all values
of $D(\Gamma)$ have the same sign.⁶

⁵ The nondegeneracy of the trace form for a number field is a transparent result, not requiring
anything deep from Section IX.15 of Basic Algebra, since any $u \neq 0$ in K has $\mathrm{Tr}_{K/\mathbb{Q}}(uu^{-1}) =
\mathrm{Tr}_{K/\mathbb{Q}}(1) = n \neq 0$.


Fix an ordered basis $\Gamma = (v_1, \dots, v_n)$ of K, and consider the abelian group
consisting of the Z span $\mathbb{Z}(\Gamma)$ of the members of $\Gamma$. This is evidently a free
abelian group of rank n. If an ordered basis $\Delta = (w_1, \dots, w_n)$ has the property
that $\mathbb{Z}(\Delta) \subseteq \mathbb{Z}(\Gamma)$, then the theory in Section IV.9 of Basic Algebra that leads
to the Fundamental Theorem of Finitely Generated Abelian Groups shows that if
we write formally

$$\begin{pmatrix} w_1 \\ \vdots \\ w_n \end{pmatrix} = C \begin{pmatrix} v_1 \\ \vdots \\ v_n \end{pmatrix},$$

then there exist n-by-n integer matrices $M_1$ and $M_2$ of determinant $\pm 1$ such that
$D = M_1 C M_2$ is diagonal, and moreover the order of $\mathbb{Z}(\Gamma)/\mathbb{Z}(\Delta)$ is $|\det D| =
|\det C|$. Examining the definition of C, we see that $C = \begin{pmatrix} I \\ \Gamma\Delta \end{pmatrix}^{t}$. Consequently
we obtain

$$|\mathbb{Z}(\Gamma)/\mathbb{Z}(\Delta)| = \left| \det \begin{pmatrix} I \\ \Gamma\Delta \end{pmatrix} \right|,$$

a formula we shall use repeatedly in this chapter without specific reference.

Proposition 5.1. If $\Gamma$ is a basis of K over Q whose members all lie in R,
then $|R/\mathbb{Z}(\Gamma)|^2 = D(\Gamma)/D_K$. In particular, $\Gamma$ is a Z basis of R if and only if
$D(\Gamma) = D_K$.


REMARKS. We already know from Basic Algebra that R is a free abelian 
group of rank n. The second conclusion of this proposition, in combination with 
the transparent observation that the trace form is nonsingular for a number field, 
gives a more direct proof of this fact. Introductory treatments of algebraic number 
theory sometimes give this more direct proof, whose details are spelled out in the 
second paragraph below. 


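PROOF. Let $\Delta$ and $\Omega$ be two bases of K over Q whose members all lie in R,
and suppose that $\mathbb{Z}(\Delta) \subseteq \mathbb{Z}(\Omega)$. Then the above discussion shows that

$$|D(\Delta)| = |D(\Omega)| \left( \det \begin{pmatrix} I \\ \Omega\Delta \end{pmatrix} \right)^{2}$$

and that

$$|\mathbb{Z}(\Omega)/\mathbb{Z}(\Delta)|^{2} = \left( \det \begin{pmatrix} I \\ \Omega\Delta \end{pmatrix} \right)^{2}.$$

Since $D(\Delta)$ and $D(\Omega)$ are nonzero and have the same sign, we obtain

$$D(\Delta)/D(\Omega) = |\mathbb{Z}(\Omega)/\mathbb{Z}(\Delta)|^{2}. \qquad (*)$$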


⁶ As was observed above, any $D(\Delta)$ is the product of $D(\Gamma)$ and the square of a rational number.
Hence $D(\Delta)$ and $D(\Gamma)$ have the same sign.

To prove the proposition, we prove the “if” part of the second conclusion
first, without using the known fact that R is free abelian. Choose $\Delta$ such that
$D(\Delta) = D_K$ and such that $\Delta$ has all its members in R. Arguing by contradiction,
suppose that $\Delta$ fails to be a Z basis of R. Let r be an element of R not in $\mathbb{Z}(\Delta)$.
Then the Z span of $\mathbb{Z}(\Delta) \cup \{r\}$ is a finitely generated additive subgroup of K and
must be free abelian of rank $\geq n$. Being a subgroup of the additive group of K,
it cannot have rank greater than n and hence has rank exactly n. Let $\Omega$ be an
ordered Z basis of this subgroup. Since $\mathbb{Z}(\Delta) \subsetneq \mathbb{Z}(\Omega)$, the right side of (*) is
> 1, and thus $|D_K| > |D(\Omega)|$. But this is a contradiction because the members of
$\Omega$ lie in R, and hence $\Delta$ is a Z basis of R. In particular, a Z basis of R exists.

To prove the rest of the proposition, take $\Omega$ in (*) to be a Z basis of R,
and let $\Delta = \Gamma$ be any given basis of K over Q that lies in R. Then (*) gives
$|R/\mathbb{Z}(\Gamma)|^2 = D(\Gamma)/D(\Omega)$. Since $|R/\mathbb{Z}(\Gamma)|$ cannot be less than 1, $|D(\Gamma)|$ cannot
be less than $|D(\Omega)|$. Thus $D_K = D(\Omega)$, and $|R/\mathbb{Z}(\Gamma)|^2 = D(\Gamma)/D_K$. This
proves the first conclusion of the proposition, and the “only if” part of the second
conclusion is immediate.


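EXAMPLE. Field discriminant of a quadratic number field. Let $K = \mathbb{Q}(\sqrt{m})$,
where m is a square-free integer other than 1. From Section I.6 a Z ordered basis
$\Gamma$ of R is given by

$$\Gamma = \begin{cases} \{1, \sqrt{m}\} & \text{if } m \equiv 2 \text{ or } 3 \bmod 4, \\ \{1, \tfrac12(\sqrt{m} - 1)\} & \text{if } m \equiv 1 \bmod 4. \end{cases}$$

Proposition 5.1 allows us to compute $D_K$ from this information. The matrix whose
determinant is $D_K$ in the two cases is $\begin{pmatrix} 2 & 0 \\ 0 & 2m \end{pmatrix}$ and $\begin{pmatrix} 2 & -1 \\ -1 & \tfrac12(m+1) \end{pmatrix}$, respectively, and
thus

$$D_K = \begin{cases} 4m & \text{if } m \equiv 2 \text{ or } 3 \bmod 4, \\ m & \text{if } m \equiv 1 \bmod 4. \end{cases}$$

This is the formula that we took as a definition of field discriminant in Section
I.6.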
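
The two trace matrices in this example can be recomputed mechanically. The sketch below is illustrative only: it assumes m is square free and not equal to 1 (this is not checked), represents $a + b\sqrt{m}$ as a pair `(a, b)` of rationals, and uses hypothetical helper names (`mul`, `trace`, `integral_basis`, `field_discriminant`) that do not come from the text.

```python
from fractions import Fraction


def mul(x, y, m):
    """Product of a + b*sqrt(m) and c + d*sqrt(m), each given as a pair."""
    a, b = x
    c, d = y
    return (a * c + b * d * m, a * d + b * c)


def trace(x):
    """Trace of a + b*sqrt(m) over Q: the sum of the two conjugates."""
    return 2 * x[0]


def integral_basis(m):
    """The Z basis of R quoted in the example (m assumed square free)."""
    one = (Fraction(1), Fraction(0))
    if m % 4 == 1:
        return [one, (Fraction(-1, 2), Fraction(1, 2))]  # (sqrt(m) - 1)/2
    return [one, (Fraction(0), Fraction(1))]             # sqrt(m)


def field_discriminant(m):
    """Determinant of the 2-by-2 matrix of traces Tr(v_i v_j)."""
    g = integral_basis(m)
    B = [[trace(mul(g[i], g[j], m)) for j in range(2)] for i in range(2)]
    return B[0][0] * B[1][1] - B[0][1] * B[1][0]


if __name__ == "__main__":
    for m in [-3, -1, 2, 5]:
        print(m, field_discriminant(m))
```

Python's `%` returns a nonnegative remainder for a positive modulus, so negative m such as −3 are classified correctly as congruent to 1 modulo 4.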


For a general number field K of degree n over Q, there is no easy way to obtain
a Z basis of R. Instead, one tries to compute $D_K$ and find such a basis at the same
time by successive refinements.

The first step is to use the special kind of Q basis of K whose existence is
guaranteed by the Theorem of the Primitive Element. Specifically one can write
$K = \mathbb{Q}(\xi)$ for some $\xi$ in K, since K/Q is a separable extension. Possibly after
multiplying $\xi$ by a suitably large integer, we may assume that $\xi$ is in R. Then
$\Gamma(\xi) = \{1, \xi, \xi^2, \dots, \xi^{n-1}\}$ is a Q basis of K lying in R. We normally write
$D(\xi)$ instead of $D(\Gamma(\xi))$ for the discriminant of $\Gamma(\xi)$. Write $\xi_j = \sigma_j(\xi)$ for the
j-th conjugate of $\xi$. Let $B = [B_{ij}]$ be the matrix whose determinant is $D(\xi)$. Since
the trace of an element is the sum of its conjugates, $B_{ij}$ is given by

$$B_{ij} = \mathrm{Tr}_{K/\mathbb{Q}}(\xi^{i-1}\xi^{j-1}) = \sum_{k=1}^{n} \sigma_k(\xi^{i-1}\xi^{j-1}) = \sum_{k=1}^{n} \xi_k^{\,i-1}\xi_k^{\,j-1},$$

and this is of the form $\sum_{k=1}^{n} V_{ik}V_{jk}$, where $V_{ik} = \xi_k^{\,i-1}$ is an entry of a Vandermonde
matrix. Therefore

$$D(\xi) = \det B = (\det V)^2 = \Big(\prod_{i<j} (\xi_j - \xi_i)\Big)^{2} = \prod_{i<j} (\xi_j - \xi_i)^2,$$

which coincides with the discriminant of the field polynomial of $\xi$ over Q.


EXAMPLES OF $D(\xi)$.

(1) $K = \mathbb{Q}(\xi)$, where $\xi^5 - \xi - 1 = 0$. This field was studied in Example 1 of
Section IX.17 of Basic Algebra. The discriminant of the polynomial $X^5 - X - 1$
is $2869 = 19 \cdot 151$, and thus $D(\xi) = 2869$. Proposition 5.1 shows that $D(\xi) =
D_K k^2$ for some nonzero integer k. Since 2869 is square free, we conclude that
$D_K = 2869$.

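(2) $K = \mathbb{Q}(\sqrt[3]{2})$. The minimal polynomial of $\xi = \sqrt[3]{2}$ is $X^3 - 2$, and its roots
are $\xi$, $\xi\omega$, and $\xi\omega^2$, where $\omega = e^{2\pi i/3}$. Then

$$D(\xi) = (\xi - \xi\omega)^2 (\xi - \xi\omega^2)^2 (\xi\omega - \xi\omega^2)^2 = \xi^6 (1 - \omega)^2 (1 - \omega^2)^2 (\omega - \omega^2)^2,$$

and this simplifies to $D(\xi) = -2^2 3^3$. This quantity is the product of $D_K$ by the
square of an integer. Thus $D_K$ is one of −3, −12, −27, and −108.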
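
Both values of $D(\xi)$ can be recomputed from the trace-matrix description $B_{ij} = \sum_k \xi_k^{\,i-1}\xi_k^{\,j-1}$ without finding any roots. The sketch below is an illustration under stated assumptions and not the text's method of computation: it obtains the power sums of the roots from the polynomial's coefficients by Newton's identities (a standard fact not proved here) and takes an exact determinant. The function names are hypothetical.

```python
from fractions import Fraction


def power_sums(coeffs, count):
    """Power sums p_0, ..., p_{count-1} of the roots of
    X^n + a_1 X^(n-1) + ... + a_n, where coeffs = [a_1, ..., a_n].
    Uses Newton's identities, so no roots are computed."""
    n = len(coeffs)
    p = [n]
    for k in range(1, count):
        s = sum(coeffs[i - 1] * p[k - i] for i in range(1, min(k, n + 1)))
        if k <= n:
            s += k * coeffs[k - 1]
        p.append(-s)
    return p


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if a[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return result


def discriminant_of_power_basis(coeffs):
    """D(xi) = det[Tr(xi^(i+j))], 0 <= i, j < n, for a root xi."""
    n = len(coeffs)
    p = power_sums(coeffs, 2 * n - 1)
    return det([[p[i + j] for j in range(n)] for i in range(n)])


if __name__ == "__main__":
    print(discriminant_of_power_basis([0, 0, 0, -1, -1]))  # X^5 - X - 1
    print(discriminant_of_power_basis([0, 0, -2]))         # X^3 - 2
```

The two printed values are 2869 and −108, in agreement with Examples 1 and 2.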


What happens with Example 2 is typical: a second step is needed to decide
among finitely many possibilities for $D_K$. In the general case an induction is
involved, and Proposition 5.2 below says what is to be done at each step. At the
end of this section, we shall return to Example 2 and use the proposition to see
that $D_K = -108$ is the correct choice.

Before stating Proposition 5.2, let us interpolate a generalization of the computation
of $D(\xi)$ that preceded the above examples. Suppose that $\Gamma = (\alpha_1, \dots, \alpha_n)$
is any ordered Q basis of K lying in R. Let $B = [B_{ij}]$ be the matrix whose
determinant is the discriminant of $\Gamma$. Then we have

$$B_{ij} = \mathrm{Tr}_{K/\mathbb{Q}}(\alpha_i\alpha_j) = \sum_{k=1}^{n} \sigma_k(\alpha_i\alpha_j) = \sum_{k=1}^{n} \sigma_k(\alpha_i)\sigma_k(\alpha_j) = \sum_{k=1}^{n} A_{ik}(A^t)_{kj},$$

where $A = [A_{ij}]$ is the matrix with $A_{ij} = \sigma_j(\alpha_i)$, and it follows that

$$D(\Gamma) = (\det[\sigma_j(\alpha_i)])^2.$$


This formula can be useful for computing $D(\Gamma)$ when the conjugates of the $\alpha_i$
are readily available.

Proposition 5.2. Let $\Gamma = (v_1, \dots, v_n)$ be an ordered Q basis of K lying in
R. If the Z span $\mathbb{Z}(\Gamma)$ of $\Gamma$ is a proper subgroup of R, then there exists a prime
number p such that $p^2$ divides $D(\Gamma)$ and such that some member

$$v_k' = p^{-1}(c_1 v_1 + c_2 v_2 + \cdots + c_{k-1} v_{k-1} + v_k)$$

of K lies in R with $1 \le k \le n$ and $0 \le c_j \le p - 1$ for $j \le k - 1$. If such
an element $v_k'$ is found, then $\Delta = (v_1, \dots, v_{k-1}, v_k', v_{k+1}, \dots, v_n)$ has $\mathbb{Z}(\Delta)$
properly containing $\mathbb{Z}(\Gamma)$ with $D(\Delta) = p^{-2}D(\Gamma)$.


REMARKS. A finite computation is involved in finding p and k. On the one
hand, for given p, at most $1 + p + p^2 + \cdots + p^{n-1}$ elements have to be checked for
integrality. On the other hand, we in principle have to find the field polynomial
of a certain element of K in each case and decide whether the coefficients are
integers, and this computation may be lengthy. See Problem 2 at the end of the 
chapter for an easy example, Problem 16 for a harder example, and Problem 4b 
for a related computation. 
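
The count $1 + p + \cdots + p^{n-1}$ in these remarks can be made concrete by listing the coefficient tuples. This is a small illustrative sketch with a hypothetical function name; it enumerates only the tuples $(c_1, \dots, c_{k-1})$ and does not perform the integrality test itself.

```python
from itertools import product


def candidate_coefficients(n, p):
    """Yield (k, (c_1, ..., c_{k-1})) for every candidate
    p^{-1}(c_1 v_1 + ... + c_{k-1} v_{k-1} + v_k) with 1 <= k <= n
    and 0 <= c_j <= p - 1, as in Proposition 5.2."""
    for k in range(1, n + 1):
        for cs in product(range(p), repeat=k - 1):
            yield k, cs


if __name__ == "__main__":
    for n, p in [(3, 2), (3, 3)]:
        print(n, p, sum(1 for _ in candidate_coefficients(n, p)))
```

For n = 3 this lists 7 candidates when p = 2 and 13 when p = 3.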


PROOF. Let $\mathbb{Z}(\Gamma)$ be a proper subgroup of R, and put $m = |R/\mathbb{Z}(\Gamma)|$. Choose
a Z basis $(w_1, \dots, w_n)$ of R, and write $v_i = \sum_{j=1}^{n} c_{ij}w_j$ with all $c_{ij} \in \mathbb{Z}$. We
know that $|\det[c_{ij}]| = m$, and we let p be any prime divisor of m. Reducing the
$c_{ij}$ modulo p, we see that the matrix $[c_{ij}]$ is singular modulo p, and thus there
exist integers $a_1, \dots, a_n$ not all divisible by p such that

$$\sum_{i=1}^{n} a_i c_{ij} \equiv 0 \bmod p \qquad \text{for } 1 \le j \le n.$$

Find k with $1 \le k \le n$ for which p divides all of $a_{k+1}, \dots, a_n$ but not $a_k$, and
write $\sum_{i=1}^{n} a_i c_{ij} = p l_j$ for integers $l_j$. Then

$$\sum_{i=1}^{k} a_i v_i = \sum_{j=1}^{n} \sum_{i=1}^{k} a_i c_{ij} w_j = \sum_{j=1}^{n} \Big( p l_j - \sum_{i=k+1}^{n} a_i c_{ij} \Big) w_j,$$

and the integer in parentheses on the right side is a multiple of p. Therefore
$r = \sum_{i=1}^{k} a_i v_i$ is exhibited as ps for some $s \in R$. Choose $a'$ and $d_k$ in Z with
$a'a_k - pd_k = 1$, and choose $c_i$ and $d_i$ in Z for each i with $i \le k - 1$ such that
$0 \le c_i \le p - 1$ and $a'a_i - pd_i = c_i$. Then the computation


$$pa's = a'r = \sum_{i=1}^{k} a'a_i v_i = \sum_{i=1}^{k-1} (c_i + pd_i)v_i + (1 + pd_k)v_k = \sum_{i=1}^{k-1} c_i v_i + v_k + p\sum_{i=1}^{k} d_i v_i$$

shows that $p^{-1}\big(\sum_{i=1}^{k-1} c_i v_i + v_k\big) = a's - \sum_{i=1}^{k} d_i v_i$ lies in R.

Proposition 5.1 shows that any primitive element $\xi$ of K that lies in R has
the property that $D(\xi)/D_K$ is the square of a nonzero integer, and we write this
quotient as $J(\xi)^2$ with $J(\xi) > 0$. One might hope that although some particular
choice of $\xi$ fails to have $J(\xi) = 1$, some other choice may be found for which
equality holds. We shall see in Section 4 that for a class of integers m, $\mathbb{Q}(\sqrt[3]{m})$
has such an element $\xi$ if and only if a certain nontrivial Diophantine equation in
two variables has a solution. Both cases arise: for m = 2, such a $\xi$ exists, while
for m = 175, no such $\xi$ exists.

But matters can be worse than this for a general K. The integer $J(\xi)$ with $J(\xi)^2 =
D(\xi)/D_K$ for a primitive element $\xi$ of K lying in R is sometimes called the
index of $\xi$. One might hope at least that each prime not dividing $D_K$ fails to
divide the index $J(\xi)$ for some $\xi$. However, Dedekind showed that there exist
number fields K and primes p that are common index divisors⁷ in the sense that
p divides $J(\xi)$ for every primitive element $\xi$ of K lying in R. Specifically he
showed that p = 2 is such a prime when K is obtained by adjoining to Q a root
of $X^3 + X^2 - 2X + 8$; here $D_K = -503$. We shall study this example further in
Section 4. 
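
For a root $\xi$ of Dedekind's cubic itself, the index can be read off from the polynomial discriminant. The sketch below is illustrative: it uses the standard closed formula for the discriminant of a monic cubic (an outside fact, not derived in the text), takes the value $D_K = -503$ as quoted above, and checks only this one element $\xi$, not the claim about every primitive element.

```python
from math import isqrt


def cubic_discriminant(a, b, c):
    """Discriminant of X^3 + a X^2 + b X + c (standard closed formula)."""
    return a * a * b * b - 4 * b ** 3 - 4 * a ** 3 * c - 27 * c * c + 18 * a * b * c


if __name__ == "__main__":
    d_xi = cubic_discriminant(1, -2, 8)  # D(xi) for a root of X^3 + X^2 - 2X + 8
    d_k = -503                           # field discriminant quoted in the text
    assert d_xi % d_k == 0
    index = isqrt(d_xi // d_k)           # J(xi), since J(xi)^2 = D(xi)/D_K
    print(d_xi, index)
```

The computed discriminant is −2012 = 4 · (−503), so $J(\xi) = 2$ for this particular $\xi$.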

Let us now specialize our considerations from general additive subgroups of
the form $\mathbb{Z}(\Gamma)$ to those that are ideals in R.


Proposition 5.3. If I is a nonzero ideal in R, then

(a) I contains a positive k in Z and
(b) I additively is of the form $I = \mathbb{Z}(\Gamma)$ for some Q basis $\Gamma$ of K whose
members lie in R.

Consequently R/I is a finite ring and satisfies $|R/I|^2 = D(\Gamma)/D_K$.


PROOF. Let r be a nonzero member of I, and let P(X) be the field polynomial of
r. Then P(X) is of the form $P(X) = X^n + a_{n-1}X^{n-1} + \cdots + a_1X + (-1)^n N_{K/\mathbb{Q}}(r)$,
has integers for coefficients, and has r as one of its roots. Consequently the
formula

$$(-1)^{n+1} N_{K/\mathbb{Q}}(r) = r\,(r^{n-1} + a_{n-1}r^{n-2} + \cdots + a_1)$$

shows that the nonzero integer $N_{K/\mathbb{Q}}(r)$ is the product of r by a member of R and
hence lies in I. This proves (a) with $k = |N_{K/\mathbb{Q}}(r)|$.

The ideal I additively is a subgroup of R and is thus free abelian of rank at
most n. By (a), the integer $k = |N_{K/\mathbb{Q}}(r)|$ has the property that $kR \subseteq I \subseteq R$.
Since R/kR has $k^n$ elements, R/I is finite. Therefore I has rank n as an additive
group and must be of the asserted form $\mathbb{Z}(\Gamma)$. This proves (b). The formula
$|R/I|^2 = D(\Gamma)/D_K$ is immediate from Proposition 5.1.


⁷ Terminology varies for this notion. Such primes p are more usually called common inessential
discriminant divisors or essential discriminant divisors. The very fact that these two more usual
names appear to contradict each other is sufficient reason to avoid using either name.

The absolute norm N(I) of a nonzero ideal I of R is defined to be $N(I) =
|R/I|$. This is necessarily a positive integer by Proposition 5.3. To be able to
work with this notion, we shall make use of the unique factorization of ideals of
R as given in Theorem 8.55 of Basic Algebra. That theorem says that such an
ideal I has a factorization of the form $I = \prod_{j=1}^{l} P_j^{k_j}$, where the $P_j$ are distinct prime
ideals of R, and that this factorization is unique except for the order of the factors.


Proposition 5.4. The absolute norms of nonzero ideals of R have the following
properties:
(a) N(R) = 1.
(b) If $I \subseteq J$ are nonzero ideals in R, then N(J) divides N(I), and I = J if
and only if N(I) = N(J).
(c) If I and J are nonzero ideals in R, then N(IJ) = N(I)N(J).
(d) If $(\alpha)$ is a nonzero principal ideal in R, then $N((\alpha)) = |N_{K/\mathbb{Q}}(\alpha)|$.


PROOF. Conclusion (a) is immediate, and so is most of (b). If $I \subseteq J$ and
N(I) = N(J), then the First Isomorphism Theorem for abelian groups yields
$(R/I)/(J/I) \cong R/J$, and it follows that $N(I)/|J/I| = N(J)$. Since N(I) and
N(J) are finite, N(I) = N(J) if and only if $|J/I| = 1$, i.e., if and only if I = J.

For (c), we begin with the special case that I and J are powers of a nonzero
prime ideal P. Inductively it is enough to show that $N(P^k) = N(P)N(P^{k-1})$
for $k \ge 1$. Since $(R/P^k)/(P^{k-1}/P^k) \cong R/P^{k-1}$ as abelian groups, it is enough
to show that

$$|P^{k-1}/P^k| = |R/P|. \qquad (*)$$

The ring R operates on the ideal $P^{k-1}$, carrying $P^k$ into itself, and P carries $P^{k-1}$
into $P^k$. Thus $P^{k-1}/P^k$ is a unital module for the ring R/P, which is a field
because P is maximal. Hence $P^{k-1}/P^k$ is a vector space over R/P. Corollary
8.60 of Basic Algebra shows that this vector space is 1-dimensional, and then (*)
is immediate.

For the general case in (c), Corollary 8.63 of Basic Algebra shows that if
$I = \prod_{j=1}^{l} P_j^{k_j}$ is the unique factorization of the nonzero ideal I as the product
of positive powers of distinct prime ideals $P_j$, then $R/I \cong \prod_{j=1}^{l} R/P_j^{k_j}$. Hence
$N(I) = \prod_{j=1}^{l} N(P_j^{k_j})$. Because of the special case that is already proved, $N(I) =
\prod_{j=1}^{l} N(P_j)^{k_j}$. Then (c) follows in the general case.


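For (d), if $\Gamma = (u_1, \dots, u_n)$ is an ordered Z basis of R, then the tuple
$\alpha\Gamma = (\alpha u_1, \dots, \alpha u_n)$ is an ordered Z basis of $(\alpha)$, and we know that $N((\alpha)) =
|R/(\alpha)| = |\mathbb{Z}(\Gamma)/\mathbb{Z}(\alpha\Gamma)| = \left|\det \begin{pmatrix} I \\ \Gamma,\alpha\Gamma \end{pmatrix}\right|$. But $\begin{pmatrix} I \\ \Gamma,\alpha\Gamma \end{pmatrix}$ is just the matrix of the
Q linear map left-by-$\alpha$ in the Q basis $\Gamma$, and the determinant of this linear map
is $N_{K/\mathbb{Q}}(\alpha)$ by definition of the norm of an element.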
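
Conclusion (d) can be spot-checked in a quadratic field by counting cosets directly. The sketch below is an illustration under assumptions, not part of the proof: it takes m square free with m ≡ 2 or 3 mod 4, so that $(1, \sqrt{m})$ is a Z basis of R by the earlier example, and it identifies R with $\mathbb{Z}^2$. All function names are hypothetical. Two points of $\mathbb{Z}^2$ lie in the same coset of $(\alpha)$ exactly when the adjugate of the multiplication matrix sends their difference into $d\,\mathbb{Z}^2$, where d is the absolute value of the determinant.

```python
def mult_matrix(a, b, m):
    """Matrix of left multiplication by alpha = a + b*sqrt(m) in the
    basis (1, sqrt(m)); columns are alpha*1 and alpha*sqrt(m)."""
    return [[a, b * m], [b, a]]


def element_norm(a, b, m):
    """N(alpha) as the determinant of left multiplication by alpha."""
    M = mult_matrix(a, b, m)
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]


def count_cosets(a, b, m):
    """|R/(alpha)| by brute force, for alpha != 0."""
    M = mult_matrix(a, b, m)
    d = abs(element_norm(a, b, m))
    adj = [[M[1][1], -M[0][1]], [-M[1][0], M[0][0]]]
    keys = set()
    # Since d*Z^2 lies inside the lattice (alpha), the box [0, d)^2
    # already meets every coset.
    for x in range(d):
        for y in range(d):
            keys.add(((adj[0][0] * x + adj[0][1] * y) % d,
                      (adj[1][0] * x + adj[1][1] * y) % d))
    return len(keys)


if __name__ == "__main__":
    for a, b, m in [(2, 1, -1), (3, 1, 2), (1, 1, 2)]:
        print((a, b, m), element_norm(a, b, m), count_cosets(a, b, m))
```

In each sample case the coset count agrees with the absolute value of the norm $a^2 - mb^2$.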




EXAMPLE 2 OF $D(\xi)$, CONTINUED. For $K = \mathbb{Q}(\sqrt[3]{2})$, we have seen that
the discriminant of the Q basis $\Gamma(\sqrt[3]{2})$ is $D(\sqrt[3]{2}) = -3^3 2^2$. We are going
to show that $(1, \sqrt[3]{2}, \sqrt[3]{4})$ is a Z basis of R, and then it follows that the field
discriminant of K is $D_K = -3^3 2^2$. We apply Proposition 5.2. The only primes
that need testing in that proposition are the ones dividing $D(\sqrt[3]{2})$, and thus
we consider p = 2 and p = 3. We want to see that no expression $p^{-1}(1)$
or $p^{-1}(c_1 + \sqrt[3]{2})$ or $p^{-1}(c_1 + c_2\sqrt[3]{2} + \sqrt[3]{4})$ is an algebraic integer for some
coefficients $c_1$ and $c_2$ between 0 and p − 1. We can discard $p^{-1}(1)$ because the
only rational numbers that are algebraic integers are the members of Z. If the
field polynomial over Q of some $\xi$ in K is $X^3 + a_2X^2 + a_1X + a_0$, then the
field polynomial of $p^{-1}\xi$ is $X^3 + p^{-1}a_2X^2 + p^{-2}a_1X + p^{-3}a_0$. So the question
of integrality is one of divisibility of the coefficients of the field polynomials of
certain algebraic integers $\xi$ by suitable powers of p. These coefficients, up to sign,
are the values of the elementary symmetric polynomials on the three conjugates
of $\xi$.

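In the case at hand, only the coefficient $a_0$ is needed. That is, it is enough to
see that the norm of $\xi$ is never divisible by 8 or 27 for $\xi$ equal to $c_1 + \sqrt[3]{2}$ or
$c_1 + c_2\sqrt[3]{2} + \sqrt[3]{4}$ as above. Let us write $\xi = c_1 + c_2\theta + c_3\theta^2$ with $\theta = \sqrt[3]{2}$ and
with $c_1, c_2, c_3$ in Z. Then $a_0 = -N_{K/\mathbb{Q}}(\xi)$, and the norm is the product of the
three conjugates of $\xi$. If $\omega = e^{2\pi i/3}$, we compute that

$$\begin{aligned}
N_{K/\mathbb{Q}}(\xi) &= (c_1 + c_2\theta + c_3\theta^2)(c_1 + c_2\theta\omega + c_3\theta^2\omega^2)(c_1 + c_2\theta\omega^2 + c_3\theta^2\omega) \\
&= (c_1^3 + 2c_2^3 + 4c_3^3) + \theta^3 c_1c_2c_3(2\omega + 3\omega^2 + \omega^4) \\
&= (c_1^3 + 2c_2^3 + 4c_3^3) - 6c_1c_2c_3.
\end{aligned}$$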


For p = 2, we consider this expression when $c_1, c_2, c_3$ are chosen from {0, 1}.
To get divisibility by 8, we check this expression modulo 8. Each $c_j^3$ is $c_j$ for
$c_j \in \{0, 1\}$. Looking at the expression modulo 2, we see that $c_1$ must be even,
i.e., $c_1 = 0$. Then 8 must divide $2c_2 + 4c_3$, and we obtain $c_2 = c_3 = 0$, in
contradiction to the formulas for the $\xi$'s under consideration.

For p = 3, it is enough to consider this expression when $c_1, c_2, c_3$ are chosen
from {−1, 0, +1}. Since each $c_j$ has $|c_j| \le 1$, we see that $|N_{K/\mathbb{Q}}(\xi)| \le 13$,
and divisibility by 27 can occur only if $N_{K/\mathbb{Q}}(\xi) = 0$, which we know entails
$\xi = 0$. Thus no $\xi$ meets the test of Proposition 5.2, and the conclusion is that

$(1, \sqrt[3]{2}, \sqrt[3]{4})$ is a Z basis of R in $\mathbb{Q}(\sqrt[3]{2})$.
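
The case analysis for p = 2 and p = 3 is small enough to confirm by exhaustive search. The sketch below is illustrative and uses hypothetical function names; it relies on the norm formula $c_1^3 + 2c_2^3 + 4c_3^3 - 6c_1c_2c_3$ computed above and tests only the necessary condition used in the text, namely that $p^3$ divide the norm of a candidate.

```python
from itertools import product


def norm(c1, c2, c3):
    """Norm of c1 + c2*theta + c3*theta^2 for theta the real cube root of 2."""
    return c1 ** 3 + 2 * c2 ** 3 + 4 * c3 ** 3 - 6 * c1 * c2 * c3


def candidates(residues):
    """Coefficient triples of c1 + theta and c1 + c2*theta + theta^2."""
    for c1 in residues:
        yield (c1, 1, 0)
    for c1, c2 in product(residues, repeat=2):
        yield (c1, c2, 1)


def some_norm_divisible(p, residues):
    """True if some candidate has norm divisible by p^3."""
    return any(norm(*c) % p ** 3 == 0 for c in candidates(residues))


if __name__ == "__main__":
    print(some_norm_divisible(2, [0, 1]))      # residues used for p = 2
    print(some_norm_divisible(3, [-1, 0, 1]))  # residues used for p = 3
```

Both calls print False, which matches the conclusion that no candidate passes the test of Proposition 5.2.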
The field discriminant plays a role in determining how a prime ideal (p) in Z,
p being a prime number, splits when one extends (p) to an ideal (p)R in the
ring R of algebraic integers in a number field K of degree n over Q. In this
situation, recall from Theorem 9.60 of Basic Algebra that the prime factorization
of the ideal (p)R in R is of the form $(p)R = \prod_{i=1}^{g} P_i^{e_i}$ with $\sum_{i=1}^{g} e_i f_i = n$; here
$n = [K : \mathbb{Q}]$, the $P_i$ are distinct, and $f_i = \dim_{\mathbb{F}_p}(R/P_i)$. The integers $e_i$ are
called ramification indices, and the integers $f_i$ are called residue class degrees.
The extension K/Q is said to be ramified at p, and the prime p of Z is said to
ramify in K, if some $e_i$ is > 1 in this decomposition.⁸


Theorem 5.5 (Dedekind Discriminant Theorem). The prime p of Z ramifies
in a number field K if and only if p divides the field discriminant $D_K$ of K.


In this chapter we shall prove this theorem only in a useful special case, namely
in the case that p is not a common index divisor. Only finitely many primes can
divide the index $J(\xi) = (D(\xi)/D_K)^{1/2}$ for a single primitive element $\xi$ of K lying
in R, and thus there are only finitely many common index divisors.⁹ Consequently
the special case that we are proving implies that only finitely many primes of Z
ramify in K.

The difficulty in proving Theorem 5.5 in full generality is that we lack sufficient 
tools for addressing questions by localization. At the end of this section, we shall 
make some comments about how one can proceed with further tools. 

As we shall see later in this section, Theorem 5.5 for primes that are not 
common index divisors is an easy consequence of the following theorem. 


Theorem 5.6 (Kummer’s criterion). Let K be a number field, and let R be its
ring of algebraic integers. Suppose that F(X) is a monic irreducible polynomial
in Z[X], that $\xi$ is a root of F(X) in C, and that p is a prime number that does
not divide the integer $J(\xi)$ such that $J(\xi)^2 = D(\xi)/D_K$. Write $\bar{F}(X)$ for the
reduction of F(X) modulo p, let

$$\bar{F}(X) = \bar{F}_1(X)^{e_1} \cdots \bar{F}_g(X)^{e_g}$$

be the unique factorization of $\bar{F}(X)$ in $\mathbb{F}_p[X]$ into a product of powers of distinct
irreducible monic polynomials, and let $f_i = \deg(\bar{F}_i)$. For each i with $1 \le i \le g$,
select a monic polynomial $F_i(X)$ in Z[X] whose reduction modulo p is $\bar{F}_i(X)$,
and let $P_i$ be the ideal in R defined by

$$P_i = pR + F_i(\xi)R.$$

Then the $P_i$'s are distinct prime ideals of R with $\dim_{\mathbb{F}_p}(R/P_i) = f_i$, and the
unique factorization of (p)R into prime ideals is

$$(p)R = P_1^{e_1} \cdots P_g^{e_g}.$$
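
To see what the criterion predicts in a concrete case, one can factor F(X) modulo p and read off the pairs $(f_i, e_i)$. The sketch below is a naive illustration, not an efficient algorithm: it factors a monic polynomial over $\mathbb{F}_p$ by trial division with monic polynomials of increasing degree, which is adequate only for small p and small degree. Polynomials are lists of coefficients with the constant term first, and the function names are hypothetical. The sample polynomial is $X^3 - 2$, for which $J(\sqrt[3]{2}) = 1$ by the example above, so the hypothesis on p holds for every prime.

```python
from itertools import product


def trim(f):
    """Drop trailing zero coefficients in place and return f."""
    while f and f[-1] == 0:
        f.pop()
    return f


def divmod_poly(f, g, p):
    """Quotient and remainder of f by a monic g over F_p."""
    f = trim([c % p for c in f])
    q = [0] * max(len(f) - len(g) + 1, 0)
    while len(f) >= len(g):
        shift, lead = len(f) - len(g), f[-1]
        q[shift] = lead
        for i, c in enumerate(g):
            f[i + shift] = (f[i + shift] - lead * c) % p
        trim(f)
    return q, f


def factor_mod_p(f, p):
    """List of (monic irreducible factor, exponent) for a monic f over F_p.
    Any monic divisor of degree d found after all lower-degree factors
    have been removed is automatically irreducible."""
    f = trim([c % p for c in f])
    factors, d = [], 1
    while len(f) - 1 >= 2 * d:
        for tail in product(range(p), repeat=d):
            g, e = list(tail) + [1], 0
            while len(f) > d:
                q, r = divmod_poly(f, g, p)
                if r:
                    break
                f, e = q, e + 1
            if e:
                factors.append((g, e))
        d += 1
    if len(f) > 1:
        factors.append((f, 1))  # what remains is irreducible
    return factors


if __name__ == "__main__":
    F = [-2, 0, 0, 1]  # X^3 - 2
    for p in (2, 3, 5, 7):
        print(p, [(len(g) - 1, e) for g, e in factor_mod_p(F, p)])
```

The printed pairs $(f_i, e_i)$ are [(1, 3)] for p = 2 and p = 3, [(1, 1), (2, 1)] for p = 5, and [(3, 1)] for p = 7. Some $e_i$ exceeds 1 only at 2 and 3, which are the primes dividing −108.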


⁸ More generally “relative discriminants,” which we have not defined, play a role in the splitting
of prime ideals in passing from a general number field to a finite extension. The cited Theorem 9.60
applies in this more general situation as well. This more general topic will be discussed further in
Problems 5-9 at the end of this chapter and very briefly in Chapter VI.

⁹ In fact, it can be shown that every common index divisor is less than $[K : \mathbb{Q}]$.




REMARKS. The additive group $\mathbb{Z}(\Gamma(\xi))$ generated by the powers of $\xi$ through
$\xi^{n-1}$ is a ring, since $\xi^n$ is an integral combination of the lower powers of $\xi$, and
this ring has index $J(\xi)$ as a subring of R. We divide the proof into two parts. The
first part will give a complete proof in the special case that the subring $\mathbb{Z}(\Gamma(\xi))$ is
all of R, but we shall retain notation that distinguishes the subring from the whole
ring in order to see how much of the proof works for the general case. After the
first part we pause for a lemma that will be used to tie results for the subring to
results for all of R, and then we return to apply the lemma and complete the proof
of Theorem 5.6.


FIRST PART OF PROOF. Let $P_i'$ be the ideal $p\mathbb{Z}[X] + F_i(X)\mathbb{Z}[X]$ in Z[X]. The
passage from Z[X] to the quotient $\mathbb{Z}[X]/P_i'$ can be achieved in two steps, first
using the substitution homomorphism carrying Z to $\mathbb{F}_p$ and X to itself and then
taking the quotient by the principal ideal $(\bar{F}_i(X))$. Since $\bar{F}_i(X)$ is irreducible in
$\mathbb{F}_p[X]$, the quotient is a field and $P_i'$ has to be prime. The number of elements in
$\mathbb{Z}[X]/P_i'$ is $p^{f_i}$ because $\deg(\bar{F}_i(X)) = f_i$. The ideals $P_i'$ are distinct because the
polynomials $\bar{F}_i(X)$ are distinct.

Meanwhile, the substitution homomorphism of Z[X] leaving Z fixed and
carrying X to $\xi$ is a ring homomorphism of Z[X] onto $\mathbb{Z}(\Gamma(\xi))$. Let $P_i''$ be the
image of $P_i'$ under this homomorphism, i.e., let $P_i'' = p\mathbb{Z}(\Gamma(\xi)) + F_i(\xi)\mathbb{Z}(\Gamma(\xi))$.
This is an ideal. The composite ring homomorphism of Z[X] onto $\mathbb{Z}(\Gamma(\xi))/P_i''$
factors through to a ring homomorphism of $\mathbb{Z}[X]/P_i'$ onto $\mathbb{Z}(\Gamma(\xi))/P_i''$. Since the
domain is a field and the identity maps to the identity, the homomorphism is one-one
and the image is a field. Thus $P_i''$ is a prime ideal, the order of $\mathbb{Z}(\Gamma(\xi))/P_i''$
is $p^{f_i}$, and $P_i'$ is the complete inverse image of $P_i''$. Since the ideals $P_i'$ can
be recovered from the $P_i''$ and since the $P_i'$ are distinct, the $P_i''$ are distinct.

The next step is to compare the ideals $\prod_{i=1}^g P_i^{e_i}$ and $(p)R$. We shall use the
fact that the polynomial $\prod_{i=1}^g F_i(X)^{e_i} - F(X)$ in $\mathbb{Z}[X]$ has coefficients divisible
by $p$ and therefore lies in $p\mathbb{Z}[X]$. The computation

$$
\begin{aligned}
\prod_{i=1}^g P_i^{e_i} &= \prod_{i=1}^g \big(pR + F_i(\xi)R\big)^{e_i} \\
&\subseteq pR + \prod_{i=1}^g F_i(\xi)^{e_i} R \\
&\subseteq pR + \Big(\prod_{i=1}^g F_i^{e_i} - F\Big)(\xi)\,R && \text{since } F(\xi) = 0 \\
&\subseteq pR + p\,\mathbb{Z}(\Gamma(\xi))\,R && \text{since } \prod_{i=1}^g F_i(X)^{e_i} - F(X) \text{ lies in } p\mathbb{Z}[X]
\end{aligned}
$$

shows that $\prod_{i=1}^g P_i^{e_i} \subseteq (p)R$. If we can show that $N\big(\prod_{i=1}^g P_i^{e_i}\big) = N((p)R)$,
then Proposition 5.4b will allow us to conclude that $\prod_{i=1}^g P_i^{e_i} = (p)R$.

At this point let us specialize to the case that $\mathbb{Z}(\Gamma(\xi)) = R$ and see how to
complete the proof. Under this assumption the definitions of $P_i$ and $P_i''$ exactly
match. What we have shown about the $P_i''$ thus says that the $P_i$ are distinct prime
ideals in $R$ with $|R/P_i| = p^{f_i}$, hence with $\dim_{\mathbb{F}_p}(R/P_i) = f_i$. Use of Proposition
5.4 and the fact that $|\mathbb{Z}(\Gamma(\xi))/P_i''| = p^{f_i}$ gives
$N\big(\prod_{i=1}^g P_i^{e_i}\big) = \prod_{i=1}^g N(P_i)^{e_i} = \prod_{i=1}^g p^{e_i f_i} = p^{\sum_{i=1}^g e_i f_i} = p^n$,
the last equality holding because $\deg F(X) = \sum_{i=1}^g e_i \deg \bar F_i(X)$. Since $p^n$ equals $N((p)R)$, the desired equality of norms has
been proved. This completes the proof of the theorem when $\mathbb{Z}(\Gamma(\xi)) = R$.


We interrupt the general proof for the promised lemma. When we apply
the lemma to finish the proof of Theorem 5.6, we shall take $A = \mathbb{Z}(\Gamma(\xi))$,
$J = J(\xi)$, and $m = p$. The hypotheses of Theorem 5.6 show that the condition
$\mathrm{GCD}(p, J(\xi)) = 1$ is satisfied.


Lemma 5.7. Suppose that $A$ is an additive subgroup of finite index $J$ in $R$ and
that $m > 1$ is an integer relatively prime to $J$. Then for each $r \in R$, there exists
$a \in A$ with $r - a$ in $mR$.


PROOF. Let $\{u_1, \dots, u_n\}$ be a $\mathbb{Z}$ basis of $R$, and let $\{v_1, \dots, v_n\}$ be a $\mathbb{Z}$ basis of
$A$. We can write $v_j = \sum_{i=1}^n c_{ij} u_i$ for an integer matrix $[c_{ij}]$ with $|\det[c_{ij}]| = J$.
Let $r = \sum_{i=1}^n b_i u_i$ be given, and let the unknown $a \in A$ be expanded as
$a = \sum_{j=1}^n a_j v_j$. Then $a = \sum_{i,j} a_j c_{ij} u_i$, and we are to arrange that the element

$$
r - a = \sum_{i=1}^n \Big(b_i - \sum_{j=1}^n c_{ij} a_j\Big) u_i
$$

is in $mR$. Thus we are to arrange that each coefficient of a $u_i$ is divisible by $m$.
Since $|\det[c_{ij}]| = J$ is relatively prime to $m$, the system of linear equations

$$
\sum_{j=1}^n c_{ij} a_j \equiv b_i \bmod m
$$

with unknowns $a_1, \dots, a_n$ has a nonsingular coefficient matrix modulo $m$ and
therefore has a solution.
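
The congruence system in this proof can be tried out on a toy case. The sketch below is only an illustration under assumed data: it takes $R = \mathbb{Z}^2$ with the standard basis, a hypothetical subgroup $A$ of index $J = 3$, and $m = 2$, and it finds the coefficients $a_j$ by brute force rather than by inverting the matrix modulo $m$. The helper name `solve_mod` is not from the text.

```python
from itertools import product

def solve_mod(C, b, m):
    """Return a list a with sum_j C[i][j]*a[j] congruent to b[i] mod m, or None.

    Brute force over (Z/m)^n; adequate only for tiny illustrative cases.
    """
    n = len(C)
    for a in product(range(m), repeat=n):
        if all((sum(C[i][j] * a[j] for j in range(n)) - b[i]) % m == 0
               for i in range(n)):
            return list(a)
    return None

# Assumed toy data: u_1, u_2 is the standard basis of R = Z^2, and A has the
# basis v_1 = u_1, v_2 = u_1 + 3*u_2 (columns of C), so J = |det C| = 3.
C = [[1, 1],
     [0, 3]]
m = 2            # relatively prime to J = 3
r = [5, 7]       # coordinates b_i of an arbitrary r in R

a = solve_mod(C, r, m)
# Coordinates of r - a in the basis u_1, u_2; each should be divisible by m.
difference = [r[i] - sum(C[i][j] * a[j] for j in range(2)) for i in range(2)]
print(a, difference)
```

If $m$ shared a factor with $J$, some right sides $b$ would admit no solution, which is the role of the coprimality hypothesis in the lemma.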


SECOND PART OF PROOF OF THEOREM 5.6. The ring homomorphism of $\mathbb{Z}(\Gamma(\xi))$
into $R/(pR + F_i(\xi)R)$ given by the composition of the inclusion followed by the
quotient map descends to a ring homomorphism

$$
\mathbb{Z}(\Gamma(\xi))\big/\big(p\mathbb{Z}(\Gamma(\xi)) + F_i(\xi)\mathbb{Z}(\Gamma(\xi))\big) \longrightarrow R/(pR + F_i(\xi)R). \qquad (*)
$$

To see that $(*)$ is onto, let $r \in R$ be given. Apply Lemma 5.7 with $A = \mathbb{Z}(\Gamma(\xi))$
and $m = p$, choosing $z \in \mathbb{Z}(\Gamma(\xi))$ by the lemma in such a way that $z - r$ is in $pR$.
Under the mapping $(*)$, the coset of $z$ goes to $r + (z - r) + pR + F_i(\xi)R = r + pR + F_i(\xi)R$, which
is the coset of $r$. Hence $(*)$ is onto.

To see that $(*)$ is one-one, suppose that $z$ maps to the 0 coset in the image.
Then $z = pr_1 + F_i(\xi)r_2$ with $r_1$ and $r_2$ in $R$. Lemma 5.7 produces $z_2$ in $\mathbb{Z}(\Gamma(\xi))$
with $r_2 - z_2$ in $pR$. Hence the decomposition $z = pr_1 + F_i(\xi)(r_2 - z_2) + F_i(\xi)z_2$
exhibits $z$ as in $pR + F_i(\xi)\mathbb{Z}(\Gamma(\xi))$. The product $F_i(\xi)\mathbb{Z}(\Gamma(\xi))$ is in $\mathbb{Z}(\Gamma(\xi))$,
since $\mathbb{Z}(\Gamma(\xi))$ is a ring, and $(*)$ will be one-one if we show that
$pR \cap \mathbb{Z}(\Gamma(\xi)) \subseteq p\mathbb{Z}(\Gamma(\xi))$. Let $\{u_i\}$ be a $\mathbb{Z}$ basis of $R$, let $\{v_j\}$ be a $\mathbb{Z}$ basis of $\mathbb{Z}(\Gamma(\xi))$, and
write $v_j = \sum_i c_{ij} u_i$ for integers $c_{ij}$. If $z'$ is in $pR \cap \mathbb{Z}(\Gamma(\xi))$, let us write
$z' = \sum_j a_j v_j$. Substitution gives $z' = \sum_i \big(\sum_j a_j c_{ij}\big) u_i$. Since $z'$ is in $pR$, we
see that $\sum_j c_{ij} a_j \equiv 0 \bmod p$ for all $i$. The determinant of $[c_{ij}]$ is the index $J(\xi)$,
up to sign, and this by assumption is not divisible by $p$. Therefore $a_j \equiv 0 \bmod p$
for all $j$, and it follows that $z'$ is in $p\mathbb{Z}(\Gamma(\xi))$. Hence $(*)$ is one-one.

We have thus proved that $(*)$ is a ring isomorphism, i.e., that $\mathbb{Z}(\Gamma(\xi))/P_i'' \cong R/P_i$
for all $i$. The left side is a field, and hence $P_i$ is a prime ideal. From
the isomorphism we obtain $N(P_i) = |\mathbb{Z}(\Gamma(\xi))/P_i''| = p^{f_i}$. The computation
$N\big(\prod_{i=1}^g P_i^{e_i}\big) = \prod_{i=1}^g N(P_i)^{e_i} = \prod_{i=1}^g p^{e_i f_i} = p^{\sum_{i=1}^g e_i f_i} = p^n$ in the last
paragraph of the first part of the proof is now fully justified, and we can therefore
conclude as in the special case that $\prod_{i=1}^g P_i^{e_i} = (p)R$.

Finally we have to prove that the ideals $P_i$ are distinct. If indices $i \ne j$ are
given, we know that $P_i'' \ne P_j''$. Choose $z$ in $P_i''$ but not $P_j''$. Then $z$ is in $P_i$
because $P_i'' \subseteq P_i$, and $z$ is not in $P_j$ because the proof above that $(*)$ is one-one
showed that $\mathbb{Z}(\Gamma(\xi)) \cap P_j \subseteq P_j''$. This completes the proof of Theorem 5.6.


PROOF OF THEOREM 5.5 WHEN $p$ IS NOT A COMMON INDEX DIVISOR. If $p$ is not
a common index divisor, we can choose a primitive $\xi$ for $K/\mathbb{Q}$ such that $\xi$ is in
$R$ and $p$ does not divide $J(\xi) = |R/\mathbb{Z}(\Gamma(\xi))|$. Let $F(X)$ be the field polynomial
of $\xi$ over $\mathbb{Q}$. Since $D(\xi) = J(\xi)^2 D_K$, $p$ divides $D_K$ if and only if $p$ divides
$D(\xi)$. Thus $p$ divides $D_K$ if and only if $p$ divides the discriminant of $F(X)$.
This happens if and only if the discriminant of $F(X)$ is $\equiv 0 \bmod p$, if and only
if $\bar F(X)$ has a root of multiplicity $> 1$ in an algebraic closure of $\mathbb{F}_p$, if and only if
the factorization over $\mathbb{F}_p$ of $\bar F(X)$ as a product of powers of distinct irreducible
monic polynomials has some factor with exponent $> 1$. Applying Theorem 5.6,
we see that this last condition is satisfied if and only if the unique factorization
of the ideal $(p)R$ in $R$ as $\prod_{i=1}^g P_i^{e_i}$ has some $e_i > 1$.


As was mentioned earlier in this section, the difficulty in proving Theorem 5.5
in complete generality is that we lack sufficient tools for addressing questions by
localization. The different prime numbers are interacting in some fashion, and the
above proofs were unable to separate them. The usual technique of localization
in our situation$^{10}$ suggests enlarging one or the other of the rings $\mathbb{Z}$ and $R$ by
adjoining inverses for all elements not in some prime ideal of interest. Then we
piece together the results. If the localizing is done with respect to a prime ideal
$(p)$ of $\mathbb{Z}$, then $\mathbb{Z}$ gets replaced by the subring $S^{-1}\mathbb{Z}$ of all members of $\mathbb{Q}$ with no
factors of $p$ in the denominators, and $R$ gets replaced by $S^{-1}R$. One advantage
of this procedure is that $S^{-1}R$ is a principal ideal domain, whereas $R$ is typically
not such a domain.

Localization in that formulation does not by itself reveal a clear path to a proof 
of Theorem 5.5. Two additional ideas enter the argument to make a path seem 
natural; Dedekind succeeded without the second of them, and historically it is 
only with hindsight that one sees the benefit of the second idea. The first idea is 
to use a more fundamental object than the discriminant of K, called the “relative 
different” of K/Q; this makes it possible to aim for a more precise description 
of the ramification indices when they are not equal to 1. The second idea is due 
to K. Hensel and involves forming a kind of completion of the localized rings;
the ring $\mathbb{Z}$ gets replaced by the ring $\mathbb{Z}_p$ of “$p$-adic integers,” and the field $\mathbb{Q}$
gets replaced by the field $\mathbb{Q}_p$ of “$p$-adic numbers.” We return to these ideas in
Chapter VI.

In treating examples of cubic fields, it will be convenient to have one further
tool available for computing discriminants. Let $K$ be a number field, let $\xi$ be
a primitive element of $K/\mathbb{Q}$, and let $F(X)$ be its field polynomial over $\mathbb{Q}$. Let
$\xi_i = \sigma_i(\xi)$ be the conjugates of $\xi$, and assume that $\xi_1 = \xi$. The conjugates are
the roots of $F(X)$ in $\mathbb{C}$, and hence

$$
F(X) = \prod_{i=1}^n (X - \xi_i).
$$

The derivative is $F'(X) = \sum_{i=1}^n \prod_{j \ne i} (X - \xi_j)$, and therefore

$$
F'(\xi) = \prod_{j=2}^n (\xi - \xi_j).
$$

Observe that the form of the left side shows that this element lies in $K$, and it
lies in $R$ if $\xi$ lies in $R$. The different $\mathcal{D}(\xi)$ of the element $\xi$ is defined to be this
element of $K$, namely$^{11}$

$$
\mathcal{D}(\xi) = F'(\xi) = \prod_{j=2}^n (\xi - \xi_j).
$$

$^{10}$ Localization was introduced in Section VIII.10 of Basic Algebra.
$^{11}$ The different of an element is related to the notion of relative different mentioned at the end of
Section 3, but the nature of that relationship will not concern us at this time.


Since $F'(X)$ has coefficients in $\mathbb{Q}$, the conjugates $\sigma_i(F'(\xi))$ of $F'(\xi)$ are the
elements $F'(\sigma_i(\xi)) = F'(\xi_i)$ for $1 \le i \le n$. The formula for $F'(X)$ shows that
$F'(\xi_i) = \prod_{j \ne i} (\xi_i - \xi_j)$. Therefore the norm of $\mathcal{D}(\xi)$ is

$$
\begin{aligned}
N_{K/\mathbb{Q}}(\mathcal{D}(\xi)) = N_{K/\mathbb{Q}}(F'(\xi)) &= \prod_{i=1}^n F'(\xi_i) = \prod_{i=1}^n \prod_{j \ne i} (\xi_i - \xi_j) \\
&= (-1)^{n(n-1)/2} \prod_{i<j} (\xi_i - \xi_j)^2 = (-1)^{n(n-1)/2} D(\xi).
\end{aligned}
$$

In other words, the norm of the different of $\xi$ is, up to sign, equal to the discriminant
of $\Gamma(\xi)$, which in turn equals the discriminant of the field polynomial of the
primitive element $\xi$. The definitions of $D(\xi)$ and $\mathcal{D}(\xi)$ and the formula connecting
them make sense if $\xi$ is allowed to be any element of $K$, primitive or not. Both
$D(\xi)$ and $\mathcal{D}(\xi)$ have the property of being nonzero if and only if $\xi$ is primitive.


EXAMPLE. For the field $K = \mathbb{Q}(\sqrt[3]{2})$, the different of $\xi = \sqrt[3]{2}$ is $\mathcal{D}(\xi) = 3\xi^2 = 3\sqrt[3]{4}$,
and the discriminant of $X^3 - 2$, up to the sign $(-1)^{3 \cdot 2/2}$, is the norm of this,
i.e.,

$$
D(\sqrt[3]{2}) = -(3\sqrt[3]{4})(3\sqrt[3]{4}\,\omega)(3\sqrt[3]{4}\,\omega^2) = -3^3 2^2, \qquad \text{where } \omega = e^{2\pi i/3}.
$$

Alternatively, the norm can be computed from a field polynomial. Specifically
the norm of $3\sqrt[3]{4}$ is the determinant of left multiplication by this element when
considered as a $\mathbb{Q}$ linear mapping of $K$ into itself.
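
The determinant description of the norm lends itself to a short computation. The following is a minimal sketch, assuming that $\xi$ is a root of a monic integer polynomial $F$ and working in the basis $(1, \xi, \dots, \xi^{n-1})$: it evaluates $F'$ at the companion matrix of $F$, takes the determinant, and applies the sign $(-1)^{n(n-1)/2}$. The function names are illustrative and not taken from the text.

```python
def companion(coeffs):
    """Matrix of multiplication by xi on the basis (1, xi, ..., xi^(n-1)).

    coeffs = [c_0, ..., c_(n-1)] for the monic F(X) = X^n + c_(n-1) X^(n-1) + ... + c_0.
    Column j holds the coordinates of xi * xi^j.
    """
    n = len(coeffs)
    C = [[0] * n for _ in range(n)]
    for j in range(n - 1):
        C[j + 1][j] = 1
    for i in range(n):
        C[i][n - 1] = -coeffs[i]
    return C

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det(M):
    """Determinant by Laplace expansion along the first row (fine for small n)."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def poly_at_matrix(poly, C):
    """Evaluate poly (coefficients listed from degree 0 upward) at C by Horner's rule."""
    n = len(C)
    result = [[0] * n for _ in range(n)]
    for c in reversed(poly):
        result = mat_mul(result, C)
        for i in range(n):
            result[i][i] += c
    return result

def discriminant_via_different(coeffs):
    """D(xi) = (-1)^(n(n-1)/2) * N(F'(xi)), with the norm taken as a determinant."""
    n = len(coeffs)
    full = list(coeffs) + [1]
    derivative = [k * full[k] for k in range(1, n + 1)]
    norm = det(poly_at_matrix(derivative, companion(coeffs)))
    return (-1) ** (n * (n - 1) // 2) * norm

print(discriminant_via_different([-2, 0, 0]))   # F(X) = X^3 - 2
```

For $X^3 - 2$ the determinant is that of $3C^2$ with $C$ the companion matrix, which reproduces the value $-3^3 2^2$ found in the example.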


We saw already in Example 2 of Section 2 that $D(\sqrt[3]{2}) = -3^3 2^2$, but the
earlier method of computation was longer. At the end of Section 2, we saw in
addition that $\{1, \sqrt[3]{2}, \sqrt[3]{4}\}$ is a $\mathbb{Z}$ basis of the ring of algebraic integers in the field
$K = \mathbb{Q}(\sqrt[3]{2})$. The use of differents does not simplify the proof of this latter fact.

In this section we consider further examples of cubic extensions of $\mathbb{Q}$. The
first such fields that we study are the pure cubic extensions $K = \mathbb{Q}(\sqrt[3]{m})$, where
$m$ is any cube-free positive integer $> 1$. Already with these fields $K$, we shall see
that $D_K$ is not necessarily equal to $D(\xi)$ for some algebraic integer $\xi$. However,
all these fields have no common index divisors. Then we examine Dedekind’s
example of a cubic number field for which 2 is a common index divisor.

The correspondence of cube-free integers $m > 1$ to fields $\mathbb{Q}(\sqrt[3]{m})$ is many-to-one:
if $m$ is given, form $m'$ from $m$ by replacing each factor $p^2$ of $m$ by $p$ and each
prime $p$ that divides $m$ only once by $p^2$; then $\mathbb{Q}(\sqrt[3]{m}) = \mathbb{Q}(\sqrt[3]{m'})$
because $mm'$ is a cube. In analyzing
$\mathbb{Q}(\sqrt[3]{m})$, it will be convenient to normalize matters so as to resolve this ambiguity.
We can write $m$ uniquely as a product $m = ab^2$ for positive square-free integers
$a$ and $b$; these have $\mathrm{GCD}(a, b) = 1$, $b^2$ is the largest square dividing $m$, and $a$ is
given by $a = m/b^2$. Then $m$ and $m' = a^2 b$ lead to the same field.


Proposition 5.8. For a cube-free integer $m > 1$, let $K = \mathbb{Q}(\sqrt[3]{m})$, and let $R$
be the ring of algebraic integers in $K$. Write $m = ab^2$ for positive square-free
integers $a$ and $b$ with $\mathrm{GCD}(a, b) = 1$, and define two members of $R$ to be the
real cube roots $\theta_1 = \sqrt[3]{ab^2}$ and $\theta_2 = \sqrt[3]{a^2 b}$. Then a $\mathbb{Z}$ basis of $R$ consists of

(a) $\{1, \theta_1, \theta_2\}$ if $a \not\equiv \pm b \bmod 9$, i.e., if $m$ is of Type I,
(b) $\{\tfrac{1}{3}(1 \pm \theta_1 \pm \theta_2), \theta_1, \theta_2\}$ for exactly one choice of the pair of signs if
$a \equiv \pm b \bmod 9$, i.e., if $m$ is of Type II.


In the respective cases the field discriminant is given by

$$
D_K = \begin{cases} -27a^2b^2 & \text{if } m \text{ is of Type I}, \\ -3a^2b^2 & \text{if } m \text{ is of Type II}. \end{cases}
$$

REMARKS. More precisely in Type II, the congruence $a \equiv \pm b \bmod 9$ implies
that $a$ and $b$ are prime to 3. Choose signs $s = \pm 1$ and $t = \pm 1$ such that
$sa \equiv 1 \bmod 3$ and $tb \equiv 1 \bmod 3$. Then the first member of the $\mathbb{Z}$ basis is to be
$\tfrac{1}{3}(1 + s\theta_1 + t\theta_2)$. The smallest $m$ leading to Type I is $m = 2$, and this case was
examined in Example 2 in Section 2. The smallest $m$ leading to Type II is $m = 10$,
and then the first member of the asserted $\mathbb{Z}$ basis of $R$ is $\tfrac{1}{3}(1 + \sqrt[3]{10} + \sqrt[3]{100})$.


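PROOF. Let $\omega = e^{2\pi i/3}$. The conjugates of $\theta_1$ can be taken to be $\sigma_1(\theta_1) = \theta_1$,
$\sigma_2(\theta_1) = \omega\theta_1$, and $\sigma_3(\theta_1) = \omega^2\theta_1$. Since $\theta_1^2 = b\theta_2$, we have $\sigma_i(\theta_2) = b^{-1}\sigma_i(\theta_1)^2$,
and therefore $\sigma_1(\theta_2) = \theta_2$, $\sigma_2(\theta_2) = \omega^2\theta_2$, and $\sigma_3(\theta_2) = \omega\theta_2$. In view of the
formula before Proposition 5.2, $D((1, \theta_1, \theta_2))$ is the square of

$$
\det \begin{pmatrix} 1 & 1 & 1 \\ \theta_1 & \omega\theta_1 & \omega^2\theta_1 \\ \theta_2 & \omega^2\theta_2 & \omega\theta_2 \end{pmatrix},
$$

and we calculate that $D((1, \theta_1, \theta_2)) = -27a^2b^2$.

Let us apply Proposition 5.2 to the triple $\{1, \theta_1, \theta_2\}$ of members of $R$. For each
prime $p$ dividing $27a^2b^2$, we are to check whether certain elements are integral.
First suppose that $p$ divides $a$ but $p \ne 3$. It is enough to check the elements
$p^{-1}(a_0 + \theta_1)$ or $p^{-1}(a_0 + a_1\theta_1 + \theta_2)$ for integrality when $a_0$ and $a_1$ are integers
from 0 to $p - 1$. Form the extension $L = K(\sqrt[3]{p}) = \mathbb{Q}(\theta_1, \sqrt[3]{p})$ of $K$, and
let $T$ be its ring of algebraic integers. The degree $[L : \mathbb{Q}]$ equals 9 if $L \ne K$ and
equals 3 if $L = K$. If $p^{-1}(a_0 + \theta_1)$ is integral, then $a_0 + p^{1/3}((a/p)b^2)^{1/3} = pr$
with $r \in R$, and hence $a_0 = p^{1/3}c$ with $c \in T$. Applying $N_{L/\mathbb{Q}}$ to both sides, we
obtain $a_0^9 = p^3 N_{L/\mathbb{Q}}(c)$ if $L \ne K$, and we obtain $a_0^3 = pN_{K/\mathbb{Q}}(c)$ if $L = K$. In
either case, $p$ divides $a_0$, and $a_0 = 0$. So $p^{-1}\theta_1$ is integral, in contradiction to
the facts that the field polynomial for $K$ of $p^{-1}\theta_1$ is $X^3 - p^{-3}ab^2$ and that $ab^2$
contains $p$ as a factor only once. We conclude that $p^{-1}(a_0 + \theta_1)$ is not integral.

Similarly if the element $p^{-1}(a_0 + a_1\theta_1 + \theta_2)$ is integral, then we see that
$a_0 + a_1 p^{1/3}((a/p)b^2)^{1/3} + p^{2/3}((a/p)^2 b)^{1/3} = pr$ with $r \in R$. So $a_0 = p^{1/3}c$
with $c \in T$, and the same argument as above shows that $a_0 = 0$. Hence
$a_1((a/p)b^2)^{1/3} + p^{1/3}((a/p)^2 b)^{1/3} = p^{2/3}r$, and $a_1((a/p)b^2)^{1/3} = p^{1/3}c'$ with
$c' \in T$. Taking the norm gives $a_1^9((a/p)b^2)^3 = p^3 N_{L/\mathbb{Q}}(c')$ if $L \ne K$ and
$a_1^3(a/p)b^2 = pN_{K/\mathbb{Q}}(c')$ if $L = K$. Since $a/p$ and $b$ are prime to $p$, we conclude
that $p$ divides $a_1$ in both cases. Therefore $a_1 = 0$, and $p^{-1}\theta_2$ is integral. The
field polynomial for $K$ of $p^{-1}\theta_2$ is $X^3 - p^{-3}a^2b$, and $a^2b$ contains $p$ as a factor
only twice. We conclude that $p^{-1}(a_0 + a_1\theta_1 + \theta_2)$ is not integral.

This disposes of the prime divisors of $a$ other than $p = 3$, and we handle
the prime divisors of $b$ other than $p = 3$ in the same way, except that we start
from the ordered triple $(1, \theta_2, \theta_1)$ and therefore need check only $p^{-1}(a_0 + \theta_2)$
and $p^{-1}(a_0 + a_1\theta_2 + \theta_1)$.

Now let us apply Proposition 5.2 to the ordered triple $(1, \theta_1, \theta_2)$ for the prime
$p = 3$, except that we allow coefficients 0 and $\pm 1$ instead of 0, 1, 2. We check
integrality for the elements $\tfrac{1}{3}(1 \pm \theta_1)$, $\tfrac{1}{3}(1 \pm \theta_2)$, $\tfrac{1}{3}(\pm\theta_1 \pm \theta_2)$, and $\tfrac{1}{3}(1 \pm \theta_1 \pm \theta_2)$
by checking whether the coefficients of their field polynomials are in $\mathbb{Z}$. For the
first two, let $\varphi$ be $\pm\theta_1$ or $\pm\theta_2$. The coefficient of the first-degree term in the field
polynomial of $\tfrac{1}{3}(1 + \varphi)$ is $\tfrac{1}{9}$ times

$$
\begin{aligned}
&(1+\varphi)(1+\omega\varphi) + (1+\varphi)(1+\omega^2\varphi) + (1+\omega\varphi)(1+\omega^2\varphi) \\
&\quad = (1+\varphi)(2 + \omega\varphi + \omega^2\varphi) + (1+\omega\varphi)(1+\omega^2\varphi) \\
&\quad = (1+\varphi)(2-\varphi) + (1 - \varphi + \varphi^2) = 2 + \varphi - \varphi^2 + 1 - \varphi + \varphi^2 = 3,
\end{aligned}
$$

hence is $\tfrac{1}{3}$. This is not an integer, and thus $\tfrac{1}{3}(1 + \varphi)$ is not in $R$. If $\varphi = \pm\theta_1$ and
$\psi = \pm\theta_2$, then the corresponding computation for $\varphi + \psi$ is

$$
\begin{aligned}
&(\varphi+\psi)(\omega\varphi+\omega^2\psi) + (\varphi+\psi)(\omega^2\varphi+\omega\psi) + (\omega\varphi+\omega^2\psi)(\omega^2\varphi+\omega\psi) \\
&\quad = -(\varphi+\psi)(\varphi+\psi) + (\varphi^2 - \varphi\psi + \psi^2) \\
&\quad = -3\varphi\psi = -3ab(\operatorname{sgn}\varphi)(\operatorname{sgn}\psi), \qquad (*)
\end{aligned}
$$

and $\tfrac{1}{9}$ of this is an integer only if 3 divides $ab$. In this case our hypotheses show
that 9 does not divide $ab$. The constant term in the field polynomial of $\tfrac{1}{3}(\varphi + \psi)$
is $-\tfrac{1}{27}$ times

$$
\begin{aligned}
(\varphi+\psi)(\omega\varphi+\omega^2\psi)(\omega^2\varphi+\omega\psi) &= \varphi^3 + \psi^3 \\
&= (\operatorname{sgn}\varphi)ab^2 + (\operatorname{sgn}\psi)a^2b \\
&= ab(b\operatorname{sgn}\varphi + a\operatorname{sgn}\psi). \qquad (**)
\end{aligned}
$$

When 3 divides $ab$ exactly once, 3 divides $(**)$ exactly once, and hence $\tfrac{1}{27}$ of
$(**)$ is not an integer. Thus $\tfrac{1}{3}(\varphi + \psi)$ is not in $R$.

It remains to check $\tfrac{1}{3}(1 + \varphi + \psi)$ with $\varphi = \pm\theta_1$ and $\psi = \pm\theta_2$. The coefficient
of the second-degree term in the field polynomial of $\tfrac{1}{3}(1 + \varphi + \psi)$ is equal to
$-\tfrac{1}{3}\operatorname{Tr}(1 + \varphi + \psi) = -1$ and is an integer; thus it imposes no restrictions. The
first-degree term of the field polynomial is $\tfrac{1}{9}$ of

$$
\begin{aligned}
&(1+\varphi+\psi)(1+\omega\varphi+\omega^2\psi) + (1+\varphi+\psi)(1+\omega^2\varphi+\omega\psi) \\
&\qquad + (1+\omega\varphi+\omega^2\psi)(1+\omega^2\varphi+\omega\psi) \\
&\quad = (1+\varphi+\psi)(2-\varphi-\psi) + (1 - \varphi - \psi + \varphi^2 - \varphi\psi + \psi^2) \\
&\quad = 3 - 3\varphi\psi = 3\big(1 - ab(\operatorname{sgn}\varphi)(\operatorname{sgn}\psi)\big), \qquad (\dagger)
\end{aligned}
$$

and $\tfrac{1}{9}$ of $(\dagger)$ is an integer if and only if $ab \equiv (\operatorname{sgn}\varphi)(\operatorname{sgn}\psi) \bmod 3$. In particular,
the proof is now complete unless $ab \equiv (\operatorname{sgn}\varphi)(\operatorname{sgn}\psi) \bmod 3$. Thus we may
assume from now on that neither $a$ nor $b$ is divisible by 3.

The constant term of the field polynomial of $\tfrac{1}{3}(1 + \varphi + \psi)$ is $-\tfrac{1}{27}$ times

$$
\begin{aligned}
&(1+\varphi+\psi)(1+\omega\varphi+\omega^2\psi)(1+\omega^2\varphi+\omega\psi) \\
&\quad = 1 + \operatorname{Tr}_{K/\mathbb{Q}}(\varphi + \psi) + (*) + (**) \\
&\quad = 1 + 0 - 3ab(\operatorname{sgn}\varphi)(\operatorname{sgn}\psi) + ab(b\operatorname{sgn}\varphi + a\operatorname{sgn}\psi).
\end{aligned}
$$

Put $\alpha = a\operatorname{sgn}\varphi$ and $\beta = b\operatorname{sgn}\psi$, so that $1 - 3\alpha\beta + \alpha\beta(\alpha + \beta)$ is to be divisible
by 27. Since neither $\beta$ nor $\alpha$ is divisible by 3, we can define $l \bmod 27$ by the
congruence $\beta \equiv l\alpha \bmod 27$. Substituting shows that $1 - 3l\alpha^2 + l\alpha^2(\alpha + l\alpha) \equiv 0 \bmod 27$,
hence that $l(l+1)\alpha^3 \equiv 3l\alpha^2 - 1 \bmod 27$, which we can rewrite as

$$
\alpha^3 l^2 + (\alpha^3 - 3\alpha^2)l + 1 \equiv 0 \bmod 27.
$$

Completing the square in $l$ allows us to write this congruence as

$$
\big(l + \tfrac{1}{2}(1 - 3\alpha^{-1})\big)^2 \equiv \tfrac{1}{4}(1 - 3\alpha^{-1})^2 - \alpha^{-3} \bmod 27.
$$

Factoring the right side, we obtain

$$
\big(l + \tfrac{1}{2}(1 - 3\alpha^{-1})\big)^2 \equiv \tfrac{1}{4}\alpha^{-4}\big[\alpha(\alpha - 1)^2(\alpha - 4)\big] \bmod 27. \qquad (\dagger\dagger)
$$

If $\alpha \equiv 1 \bmod 3$, the expression in square brackets on the right side is $\equiv 0 \bmod 27$,
and 0 is the square of 0 and $\pm 9$. If $\alpha \equiv 2 \bmod 3$, then the expression in square
brackets is a square if and only if $\alpha(\alpha - 4) \equiv c^2 \bmod 27$. Considering the
congruence only modulo 3 gives $2(-2) \equiv c^2 \bmod 3$ and therefore $c^2 \equiv 2 \bmod 3$,
which has no solutions. Thus $\alpha \equiv 2 \bmod 3$ leads to no solutions of $(\dagger\dagger)$. We can
summarize by saying that the solutions of $(\dagger\dagger)$ are given by $\alpha \equiv 1 \bmod 3$ and

$$
l + \tfrac{1}{2}(1 - 3\alpha^{-1}) \equiv 0 \bmod 9.
$$

One checks that the values $\alpha \equiv 1, 4, 7 \bmod 9$ all lead to $l \equiv 1 \bmod 9$.

Let us summarize. Let $s$ and $t$ be signs $\pm$. Then $\tfrac{1}{3}(1 + s\theta_1 + t\theta_2)$ is integral
if and only if both of the following conditions are satisfied:
(i) $sa \equiv tb \equiv 1 \bmod 3$,
(ii) $sa \equiv tb \bmod 9$.
When these conditions are satisfied, we are in Type II; otherwise we are in Type I.
This completes the proof.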
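
The classification in Proposition 5.8 is mechanical once $m$ is written as $ab^2$. The sketch below shows one way to carry it out; the function names are hypothetical, and the code takes the statement of the proposition as given rather than re-deriving it.

```python
def split_cube_free(m):
    """Write a cube-free integer m > 1 as a * b**2 with a, b square-free and coprime."""
    a, b, p, rest = 1, 1, 2, m
    while rest > 1:
        exponent = 0
        while rest % p == 0:
            rest //= p
            exponent += 1
        if exponent == 1:
            a *= p
        elif exponent == 2:
            b *= p
        elif exponent > 2:
            raise ValueError("m is not cube-free")
        p += 1   # composite p never divides rest, since smaller primes are gone
    return a, b

def pure_cubic_data(m):
    """Return (a, b, type, D_K) for K = Q(cube root of m), following Proposition 5.8."""
    a, b = split_cube_free(m)
    type_two = (a - b) % 9 == 0 or (a + b) % 9 == 0   # a congruent to +b or -b mod 9
    disc = (-3 if type_two else -27) * a * a * b * b
    return a, b, ("II" if type_two else "I"), disc

for m in (2, 10, 12, 175):
    print(m, pure_cubic_data(m))
```

The values $m = 2$ and $m = 10$ are the smallest Type I and Type II cases mentioned in the remarks after the proposition.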


In the setting of Type I in Proposition 5.8, let us form the discriminants of
$\Gamma(\theta_1) = (1, \theta_1, \theta_1^2)$ and $\Gamma(\theta_2) = (1, \theta_2, \theta_2^2)$. Using the method of computation
at the beginning of this section, we see that the differents in the two cases are
$3\theta_1^2$ and $3\theta_2^2$. Therefore the discriminant of $\Gamma(\theta_1)$ is
$D(\theta_1) = -N_{K/\mathbb{Q}}(3\theta_1^2) = -3^3(\theta_1^2)^3 = -3^3(ab^2)^2 = -3^3a^2b^4$, and the discriminant of $\Gamma(\theta_2)$ similarly is
$D(\theta_2) = -3^3a^4b^2$. The absolute value of the greatest common divisor of these
two expressions is $3^3a^2b^2 = |D_K|$, and therefore there are never any common
index divisors in Type I.

On the other hand, there exist situations in Type I in which no primitive element
$\xi$ of $\mathbb{Q}(\sqrt[3]{m})$ lying in $R$ has $\Gamma(\xi)$ as a $\mathbb{Z}$ basis. To prove this fact, we make use
of the following proposition.


Proposition 5.9. For a pure cubic extension $K = \mathbb{Q}(\sqrt[3]{ab^2})$ of Type I, an
element $\xi = x + y\theta_1 + z\theta_2$ with $\mathbb{Z}$ coefficients has $D(\xi) = D_K$ if and only if
$y^3b - z^3a = \pm 1$.


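PROOF. The matrix whose determinant is $D(\Gamma(\xi))$ is given by

$$
M = \begin{pmatrix} 3 & \operatorname{Tr}(\xi) & \operatorname{Tr}(\xi^2) \\ \operatorname{Tr}(\xi) & \operatorname{Tr}(\xi^2) & \operatorname{Tr}(\xi^3) \\ \operatorname{Tr}(\xi^2) & \operatorname{Tr}(\xi^3) & \operatorname{Tr}(\xi^4) \end{pmatrix},
$$

where $\operatorname{Tr}$ is short for $\operatorname{Tr}_{K/\mathbb{Q}}$. The element $\theta_1^i\theta_2^j$ has conjugates $\theta_1^i\theta_2^j$, $\omega^{i+2j}\theta_1^i\theta_2^j$,
and $\omega^{2i+j}\theta_1^i\theta_2^j$, where $\omega = e^{2\pi i/3}$. Thus

$$
\operatorname{Tr}(\theta_1^i\theta_2^j) = (1 + \omega^{i+2j} + \omega^{2i+j})\theta_1^i\theta_2^j = (1 + \omega^{i+2j} + \omega^{2(i+2j)})\theta_1^i\theta_2^j.
$$

This is 0 if $i + 2j$ is not divisible by 3 and is $3\theta_1^i\theta_2^j$ otherwise. We compute the
trace of each power of $\xi$ by applying the formula

$$
\operatorname{Tr}(\xi^l) = \sum_{k=0}^{l} \binom{l}{k} x^{l-k} \operatorname{Tr}\big((y\theta_1 + z\theta_2)^k\big),
$$

which comes from treating $\xi$ as a binomial. The traces of the powers of $y\theta_1 + z\theta_2$
work out to be

$$
\begin{aligned}
\tfrac{1}{3}\operatorname{Tr}(y\theta_1 + z\theta_2) &= 0, \\
\tfrac{1}{3}\operatorname{Tr}\big((y\theta_1 + z\theta_2)^2\big) &= 2yz\theta_1\theta_2 = ab(2yz), \\
\tfrac{1}{3}\operatorname{Tr}\big((y\theta_1 + z\theta_2)^3\big) &= ab(y^3b + z^3a), \\
\tfrac{1}{3}\operatorname{Tr}\big((y\theta_1 + z\theta_2)^4\big) &= (ab)^2 6y^2z^2.
\end{aligned}
$$

Substituting, we find the following formulas for the trace of each power of $\xi$:

$$
\begin{aligned}
\tfrac{1}{3}\operatorname{Tr}(\xi) &= x, \\
\tfrac{1}{3}\operatorname{Tr}(\xi^2) &= x^2 + 2(ab)yz, \\
\tfrac{1}{3}\operatorname{Tr}(\xi^3) &= x^3 + 3x(ab)2yz + (ab)(y^3b + z^3a), \\
\tfrac{1}{3}\operatorname{Tr}(\xi^4) &= x^4 + 6x^2(ab)2yz + 4x(ab)(y^3b + z^3a) + (ab)^2 6y^2z^2.
\end{aligned}
$$

The matrix $M$ is therefore of the form

$$
\tfrac{1}{3}M = \begin{pmatrix} 1 & x & x^2 + A \\ x & x^2 + A & x^3 + B \\ x^2 + A & x^3 + B & x^4 + C \end{pmatrix},
$$

where

$$
\begin{aligned}
A &= 2(ab)yz, \\
B &= 3x(ab)2yz + (ab)(y^3b + z^3a), \\
C &= 6x^2(ab)2yz + 4x(ab)(y^3b + z^3a) + (ab)^2 6y^2z^2.
\end{aligned}
$$

Expansion of $\det \tfrac{1}{3}M$ results in an expression that simplifies to

$$
\det \tfrac{1}{3}M = AC + 2xAB - 3x^2A^2 - A^3 - B^2.
$$

Thus we have only to substitute. The resulting expression simplifies greatly, and
we obtain $\det \tfrac{1}{3}M = -(ab)^2(y^3b - z^3a)^2$. Consequently

$$
D(\xi) = -3^3(ab)^2(y^3b - z^3a)^2.
$$

Since Proposition 5.8 has shown that $D_K = -3^3(ab)^2$, the result follows.

Thus in order to give an example of an $m$ for which no $\xi$ has $D(\xi) = D_K$, we
have only to select $a$ and $b$ for which the Diophantine equation $y^3b - z^3a = \pm 1$
in $y, z$ has no solution. Choose $a = 7$ and $b = 5$, so that $m = ab^2 = 175$. To
verify that the Diophantine equation has no solution, take the equation modulo
7 and then modulo 5, obtaining $5y^3 \equiv \pm 1 \bmod 7$ and $-7z^3 \equiv \pm 1 \bmod 5$. These
congruences say that $y^3 \equiv \pm 3 \bmod 7$ and $z^3 \equiv \pm 2 \bmod 5$. The only cubes modulo
7 are 0 and $\pm 1$, and thus the congruence for $y$ has no solution.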
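
Both the closed formula for $D(\xi)$ and the obstruction modulo 7 can be checked by machine on small inputs. The sketch below assumes only the multiplication rules $\theta_1^2 = b\theta_2$, $\theta_2^2 = a\theta_1$, $\theta_1\theta_2 = ab$ and the fact that the trace of $x + y\theta_1 + z\theta_2$ is $3x$; the function names are illustrative and not part of the text.

```python
def det3(M):
    (p, q, r), (s, t, u), (v, w, x) = M
    return p * (t * x - u * w) - q * (s * x - u * v) + r * (s * w - t * v)

def power_basis_discriminant(a, b, x, y, z):
    """D(Gamma(xi)) for xi = x + y*theta1 + z*theta2, computed as det [Tr(xi^(i+j))]."""
    def mul(e, f):
        (x1, y1, z1), (x2, y2, z2) = e, f
        return (x1 * x2 + a * b * (y1 * z2 + z1 * y2),   # rational part
                x1 * y2 + y1 * x2 + a * z1 * z2,          # coefficient of theta1
                x1 * z2 + z1 * x2 + b * y1 * y2)          # coefficient of theta2
    xi = (x, y, z)
    powers = [(1, 0, 0)]
    for _ in range(4):
        powers.append(mul(powers[-1], xi))
    traces = [3 * element[0] for element in powers]
    return det3([[traces[i + j] for j in range(3)] for i in range(3)])

def closed_form(a, b, y, z):
    """The value -3^3 (ab)^2 (y^3 b - z^3 a)^2 from the proof of Proposition 5.9."""
    return -27 * (a * b) ** 2 * (y ** 3 * b - z ** 3 * a) ** 2

# The case a = 7, b = 5 (m = 175): 5*y^3 is never congruent to +1 or -1 mod 7.
cubes_mod_7 = {pow(y, 3, 7) for y in range(7)}
obstructed = all((5 * c) % 7 not in (1, 6) for c in cubes_mod_7)
print(sorted(cubes_mod_7), obstructed)
```

Agreement of the two functions on a grid of small integers is of course only a sanity check of the algebra, not a proof.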

We turn to the question of the splitting of prime ideals in pure cubic extensions
$K = \mathbb{Q}(\sqrt[3]{m})$. In the notation of Proposition 5.8, we again write $m = ab^2$, and
we shall assume that the extension is of Type I. We saw in Proposition 5.8 and
the remarks afterward that $D_K$ equals, up to sign, the greatest common divisor of $D(\sqrt[3]{ab^2})$
and $D(\sqrt[3]{a^2b})$. Therefore the splitting of every prime ideal $(p)$ in $\mathbb{Z}$ is described
by Theorem 5.6. We have only to sort out the details.


Proposition 5.10. Let $K = \mathbb{Q}(\sqrt[3]{m})$ be a pure cubic extension of Type I, and
let $R$ be its ring of algebraic integers. If $p$ is a prime number, then the ideal $(p)R$
of $R$ splits into prime ideals as follows:

(a) $(p)R = P_1P_2$ with $N(P_1) = p$ and $N(P_2) = p^2$ if $p \equiv -1 \bmod 3$ and
$p$ does not divide $D_K$,
(b) $(p)R = P_1P_2P_3$ with $P_1, P_2, P_3$ distinct of norm $p$ if $p \equiv 1 \bmod 3$,
$x^3 \equiv m \bmod p$ is solvable in $\mathbb{F}_p$, and $p$ does not divide $D_K$,
(c) $(p)R$ is prime of norm $p^3$ if $p \equiv 1 \bmod 3$, $x^3 \equiv m \bmod p$ is not solvable
in $\mathbb{F}_p$, and $p$ does not divide $D_K$,
(d) $(p)R = P^3$ with $N(P) = p$ if $p$ divides $D_K$.
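
The four cases of Proposition 5.10 can be read off from the number of roots of $X^3 - m$ modulo $p$. The following sketch assumes that $m$ is cube-free of Type I and that $p$ is prime (neither is checked); the name `splitting_case` is illustrative only.

```python
def splitting_case(m, p):
    """Return "a", "b", "c", or "d" as in Proposition 5.10.

    Assumes m is a cube-free integer of Type I and p is a prime number.
    The primes dividing D_K are 3 and the primes dividing m.
    """
    if p == 3 or m % p == 0:
        return "d"
    roots = sum(1 for x in range(p) if (x ** 3 - m) % p == 0)
    # One root: linear factor times irreducible quadratic.
    # Three roots: three linear factors.  No roots: irreducible cubic.
    return {1: "a", 3: "b", 0: "c"}[roots]

for p in (2, 3, 5, 7, 11, 13, 31):
    print(p, splitting_case(2, p))
```

For $m = 2$ the prime 31 is the first prime $\equiv 1 \bmod 3$ for which 2 is a cube, so it is the first instance of case (b).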


PROOF. The prime divisors of $D_K$ are 3 and the prime divisors of $a$ and $b$.
For all other primes Theorem 5.6 shows that all ramification indices are 1. Let
$p$ be a prime of the form $6k \pm 1$ not dividing $D_K$. The multiplicative group $\mathbb{F}_p^\times$
of $\mathbb{F}_p$ is cyclic of order $p - 1$ and hence has order divisible by 3 if and only if
$p = 6k + 1$. Thus there are three cube roots of 1 when $p = 6k + 1$ but only 1
when $p = 6k - 1$. In the latter case the cubing map is one-one onto from $\mathbb{F}_p^\times$
to itself. Thus $X^3 - m$ factors modulo $p$ as the product of a first-degree factor
and an irreducible second-degree factor if $p = 6k - 1$, and (a) follows for such
primes from Theorem 5.6. If $p = 6k + 1$, then $X^3 - m$ either factors modulo $p$
as the product of three first-degree factors or is irreducible, since 1 has three cube
roots. Thus (b) and (c) follow for such primes from Theorem 5.6.

For $p = 2$ if $m$ is odd, then $X^3 - m \equiv X^3 - 1 = (X - 1)(X^2 + X + 1) \bmod 2$,
and we are in the situation of (a). This completes the discussion of primes that
do not divide $D_K$. If $p$ divides $m$, then $X^3 - m \equiv X^3 \bmod p$ is the cube of a
first-degree factor, and (d) follows in these cases. For $p = 3$ whether or not $p$
divides $m$, we have $X^3 - m \equiv X^3 - m^3 \equiv (X - m)^3 \bmod 3$, and (d) follows in
this case.

We conclude this section by discussing Dedekind’s example of a common
index divisor. The field in question is again of degree 3 over $\mathbb{Q}$ but is not of the
form $\mathbb{Q}(\sqrt[3]{m})$. Instead, the field is $K = \mathbb{Q}(\xi)$, where $\xi$ is a root of
$F(X) = X^3 + X^2 - 2X + 8$. The polynomial $F(X)$ is irreducible over $\mathbb{Q}$ because Gauss’s
Lemma shows that its only possible linear factors are $X - k$ with $k$ dividing 8
and because routine computation rules out each such linear factor. As usual, let
$R$ be the ring of algebraic integers in $K$.

The different of $\xi$ is $\mathcal{D}(\xi) = F'(\xi) = 3\xi^2 + 2\xi - 2$, and the discriminant $D(\xi)$
therefore is given by $D(\xi) = -N_{K/\mathbb{Q}}(3\xi^2 + 2\xi - 2)$. We calculate this norm as the
determinant of left multiplication by $3\xi^2 + 2\xi - 2$ on $K$, using the ordered basis
$(1, \xi, \xi^2)$. Since $\xi^3 = -\xi^2 + 2\xi - 8$ and $\xi^4 = -\xi^3 + 2\xi^2 - 8\xi = 3\xi^2 - 10\xi + 8$,
we have

$$
\begin{aligned}
(3\xi^2 + 2\xi - 2)(1) &= -2 + 2\xi + 3\xi^2, \\
(3\xi^2 + 2\xi - 2)(\xi) &= -2\xi + 2\xi^2 + 3\xi^3 = -24 + 4\xi - \xi^2, \\
(3\xi^2 + 2\xi - 2)(\xi^2) &= -2\xi^2 + 2\xi^3 + 3\xi^4 = 8 - 26\xi + 5\xi^2.
\end{aligned}
$$

Thus

$$
N_{K/\mathbb{Q}}(3\xi^2 + 2\xi - 2) = \det \begin{pmatrix} -2 & -24 & 8 \\ 2 & 4 & -26 \\ 3 & -1 & 5 \end{pmatrix} = 2^2 \cdot 503,
$$

and $D(\xi) = -2^2 \cdot 503$. Thus either the index $J(\xi)$ of $\mathbb{Z}(\Gamma(\xi))$ in $R$ is 1 with
$D_K = -2^2 \cdot 503$, or $J(\xi) = 2$ with $D_K = -503$.

Problems 24-25 at the end of the chapter show that $\tfrac{1}{2}(\xi^2 + \xi)$ is in $R$ and
that consequently the correct choice is $J(\xi) = 2$ with $D_K = -503$ and with
$\{1, \xi, \tfrac{1}{2}(\xi^2 + \xi)\}$ as a $\mathbb{Z}$ basis of $R$. In fact, 2 divides $J(\eta)$ for every primitive
element $\eta$ of $K$ lying in $R$, and therefore 2 is a common index divisor in the sense
of Section 2. One way to check this assertion would be to calculate $D(\eta)$ for
every such $\eta$. The computation would be feasible because we can express $\eta$ as a
$\mathbb{Z}$ linear combination of the members of $\{1, \xi, \tfrac{1}{2}(\xi^2 + \xi)\}$ and calculate the field
polynomial of $\eta$ in the same way that $N_{K/\mathbb{Q}}(\mathcal{D}(\xi))$ was calculated above.
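
The kind of calculation just described can be sketched in a few lines. The code below, which is only an illustration and not the method of the cited problems, uses the matrix of multiplication by $\xi$ on $(1, \xi, \xi^2)$ to recompute the norm of the different and to find the field polynomial of $\tfrac{1}{2}(\xi^2 + \xi)$; integer coefficients in that polynomial confirm that the element is integral. All names are illustrative.

```python
from fractions import Fraction

# Multiplication by xi on the basis (1, xi, xi^2), from xi^3 = -xi^2 + 2*xi - 8.
# Column j holds the coordinates of xi * xi^j.
C = [[0, 0, -8],
     [1, 0, 2],
     [0, 1, -1]]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def det3(M):
    (p, q, r), (s, t, u), (v, w, x) = M
    return p * (t * x - u * w) - q * (s * x - u * v) + r * (s * w - t * v)

def char_poly3(M):
    """Coefficients (c2, c1, c0) of det(X*I - M) = X^3 + c2*X^2 + c1*X + c0."""
    trace = M[0][0] + M[1][1] + M[2][2]
    minors = (M[0][0] * M[1][1] - M[0][1] * M[1][0]
              + M[0][0] * M[2][2] - M[0][2] * M[2][0]
              + M[1][1] * M[2][2] - M[1][2] * M[2][1])
    return (-trace, minors, -det3(M))

C2 = mat_mul(C, C)

# Left multiplication by the different 3*xi^2 + 2*xi - 2, and its determinant.
different = [[3 * C2[i][j] + 2 * C[i][j] - 2 * (i == j) for j in range(3)]
             for i in range(3)]
norm = det3(different)

# Left multiplication by eta = (xi^2 + xi)/2, with exact rational entries.
eta = [[Fraction(C2[i][j] + C[i][j], 2) for j in range(3)] for i in range(3)]
eta_poly = char_poly3(eta)
print(norm, eta_poly)
```

Here the field polynomial of $\tfrac{1}{2}(\xi^2 + \xi)$ comes out as $X^3 - 2X^2 + 3X - 10$, so the element lies in $R$ although it is not in $\mathbb{Z}(\Gamma(\xi))$.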

However, there is an easier way. Problem 28 at the end of the chapter shows
that $(2)R$ splits as the product of three distinct prime ideals of $R$. If there were
some $\eta$ for which 2 did not divide $J(\eta)$, then Theorem 5.6 would show that the
minimal polynomial of $\eta$ when reduced modulo 2 splits as the product of three
distinct first-degree factors. But $\mathbb{F}_2$ has only 2 elements, hence only two possible
distinct linear factors to offer. Thus Theorem 5.6 must not be applicable to $\eta$ and
the prime 2, and we conclude that 2 divides $J(\eta)$. Going over this argument, we
see that we have established the following more general result.

Proposition 5.11. Let $K/\mathbb{Q}$ be a field extension of degree $n$, and let $R$ be the
ring of algebraic integers in $K$. If $p$ is a prime number with $2 \le p \le n - 1$ such
that $(p)R$ splits as the product of $n$ distinct prime ideals of $R$, then $p$ is a common
index divisor for $K$.


5. Dirichlet Unit Theorem 


Let $K$ be a number field of degree $n$ over $\mathbb{Q}$, and let $R$ be its ring of algebraic
integers. We regard $K$ as a subfield of $\mathbb{C}$. The units of $K$ are understood to
be the members of the group $R^\times$ of units of the ring $R$. As was observed in
Section 2, there exist exactly $n$ field mappings of $K$ into $\mathbb{C}$, and we denote them
by $\sigma_1, \dots, \sigma_n$; one of these is the inclusion of $K$ into $\mathbb{C}$. If $x$ is in $K$, then the
images $\sigma_1(x), \dots, \sigma_n(x)$ are called the conjugates of $x$.

In Section I.6 we studied the group of units in the quadratic case $n = 2$,
and we found, particularly in the problems at the end of that chapter, that an
understanding of this group was essential to working successfully on the number-theoretic
problems studied in that chapter. When $n = 2$, we found that the
qualitative nature of the group $R^\times$ depends on the sign of the field discriminant.
The group turned out to be the finite subgroup of roots of unity in $K$ if $D_K < 0$,
and it turned out to be isomorphic to the product of a copy of $\mathbb{Z}$ and a cyclic group
of order 2 if $D_K > 0$. The hard step in this analysis was constructing an element
in the subgroup $\mathbb{Z}$ in the latter case.

Because of the importance of $R^\times$ in the quadratic case, we can expect that an
understanding of $R^\times$ for our general number field $K$ is important for higher-degree
number-theoretic questions. In this section we shall obtain a structure theorem
for $R^\times$ for general $n$ analogous to the structure theorem for $n = 2$ mentioned in
the previous paragraph. Such a theorem may not answer all important questions
about $R^\times$, but it will be a good start.$^{12}$ The main theorem is Theorem 5.13 below,
the Dirichlet Unit Theorem.

The units of $R$ are the members $\varepsilon$ of $R$ with $N_{K/\mathbb{Q}}(\varepsilon) = \pm 1$. This simple fact
is verified for general $K$ in the same way that it was verified for quadratic $K$ in
Section I.6.

Any element $\varepsilon$ of finite order in $R^\times$ is a complex number with $\varepsilon^k = 1$ for
some $k$ and hence lies on the unit circle of $\mathbb{C}$. Since such an element $\varepsilon$ is a root
of $X^k - 1$, all its conjugates $\sigma_i(\varepsilon)$ lie on the unit circle of $\mathbb{C}$. We shall prove the
following proposition about these elements.


$^{12}$ For example, when $n = 2$, we defined the fundamental unit $\varepsilon_1$ for the case $D_K > 0$ to be the
least unit $> 1$, and the sign of $N_{K/\mathbb{Q}}(\varepsilon_1)$ was a thorny question that we did not answer fully but that
affected results in the problems at the end of the chapter.

Proposition 5.12. The subgroup of $R^\times$ of elements of finite order consists of
all $l^{\text{th}}$ roots of unity in $\mathbb{C}$, where $l$ is an integer depending on $K$ that is bounded
when the degree $n = [K : \mathbb{Q}]$ is bounded.


PROOF. We are to bound the integers $k$ for which primitive $k^{\text{th}}$ roots of unity
occur in $K$. Let $k$ have prime decomposition $k = p_1^{k_1} \cdots p_r^{k_r}$. From Section
IX.9 of Basic Algebra, we know that the cyclotomic polynomial $\Phi_k(X)$ is a
monic irreducible member of $\mathbb{Z}[X]$ whose roots in $\mathbb{C}$ are exactly all primitive $k^{\text{th}}$
roots of unity; moreover, the degree of $\Phi_k(X)$ is given by the Euler $\varphi$ function:

$$
\varphi(k) = k \prod_{p \text{ divides } k} \Big(1 - \frac{1}{p}\Big).
$$

If primitive $k^{\text{th}}$ roots of unity occur in $K$, then $\varphi(k) \le n$ because $\Phi_k(X)$ is
irreducible over $\mathbb{Q}$, and hence $(p_1 - 1) \cdots (p_r - 1) \le n$. Allowing $p_1 = 2$
possibly, we see that each factor $p_j - 1$ with $j > 1$ is at least 2, and thus
$2^{r-1} \le n$. So $r$ is bounded as a function of $n$ by $\log_2 2n$, and we obtain

$$
\varphi(k) \ge k \prod_{\substack{\text{first } \log_2 2n \\ \text{primes}}} \Big(1 - \frac{1}{p}\Big) \ge k\, 2^{-\log_2 2n} = \frac{k}{2n}.
$$

Consequently $k \le 2n\varphi(k) \le 2n^2$, as required. If $R^\times$ contains one primitive $k^{\text{th}}$
root of unity in $\mathbb{C}$, then it contains them all, since the $k^{\text{th}}$ roots of unity form a
cyclic group and any primitive such root is a generator. The result follows.
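
The bound in this proof is easy to explore numerically. The sketch below lists, for a given degree $n$, the integers $k$ with $\varphi(k) \le n$; by the argument above it suffices to search up to $2n^2$. The condition $\varphi(k) \le n$ is only necessary for primitive $k^{\text{th}}$ roots of unity to lie in a field of degree $n$, so the list gives candidates, not a guarantee. The function names are illustrative.

```python
def euler_phi(k):
    """Euler's phi function, computed by trial division."""
    result, rest, p = k, k, 2
    while p * p <= rest:
        if rest % p == 0:
            while rest % p == 0:
                rest //= p
            result -= result // p
        p += 1
    if rest > 1:
        result -= result // rest
    return result

def candidate_orders(n):
    """All k with phi(k) <= n, searched up to the bound 2*n*n from the proof."""
    return [k for k in range(1, 2 * n * n + 1) if euler_phi(k) <= n]

print(candidate_orders(2))   # quadratic fields
print(candidate_orders(4))   # quartic fields
```

For $n = 2$ the candidates are 1, 2, 3, 4, 6, matching the roots of unity that occur in quadratic fields.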


We shall use the field mappings $\sigma_j : K \to \mathbb{C}$ for $1 \le j \le n$ to introduce useful
“absolute values” on $K$. The mappings $\sigma_j$ are of two types:
(i) those carrying $K$ into $\mathbb{R}$,
(ii) those carrying $K$ into $\mathbb{C}$ but not into $\mathbb{R}$; these come in pairs $\sigma$ and $\bar\sigma$,
where $\bar\sigma$ denotes the composition of $\sigma$ followed by complex conjugation.
Suppose that there are $r_1$ mappings $\sigma_j$ of the first kind and that there are $r_2$ pairs
of the second kind. Then $r_1 + 2r_2 = n$. Renumbering $\sigma_1, \dots, \sigma_n$ if necessary,
let us arrange that $\sigma_1, \dots, \sigma_{r_1}$ are of the first kind, that $\sigma_{r_1+1}, \dots, \sigma_n$ are of the
second kind, and that $\sigma_{r_1+r_2+i} = \bar\sigma_{r_1+i}$ for $1 \le i \le r_2$. We introduce $r_1 + r_2$
absolute values$^{13}$ on $K$ by the definition

$$
\|x\|_s = |\sigma_s(x)| \qquad \text{for } 1 \le s \le r_1 + r_2,
$$

where $|\cdot|$ denotes the usual absolute value function on $\mathbb{C}$. Then the function
$\operatorname{Log} : K^\times \to \mathbb{R}^{r_1+r_2}$ given by

$$
\operatorname{Log}(\varepsilon) = \big(\log\|\varepsilon\|_1, \dots, \log\|\varepsilon\|_{r_1+r_2}\big)
$$

is evidently a group homomorphism.

$^{13}$ These are called archimedean absolute values of $K$ in the general theory. Some authors refer
to them as archimedean valuations.

A lattice in a Euclidean space $\mathbb{R}^l$ is an additive subgroup $\mathbb{Z}u_1 \oplus \cdots \oplus \mathbb{Z}u_l$ such
that $\{u_1, \dots, u_l\}$ is linearly independent over $\mathbb{R}$. Such a subgroup is discrete,$^{14}$
and the quotient is compact, by the Heine-Borel Theorem.


Theorem 5.13 (Dirichlet Unit Theorem). Let $K$ be a number field of degree $n$
with $r_1 + r_2$ absolute values, and let $R$ be the ring of algebraic integers in $K$. The
kernel of the restriction to $R^\times$ of the function $\operatorname{Log}$ is the finite subgroup of roots
of unity in $K^\times$, and the image of this restriction of $\operatorname{Log}$ is a lattice in the vector
subspace of elements $(x_1, \dots, x_{r_1+r_2})$ in $\mathbb{R}^{r_1+r_2}$ satisfying

$$
x_1 + \cdots + x_{r_1} + 2x_{r_1+1} + \cdots + 2x_{r_1+r_2} = 0.
$$

Consequently $R^\times$ is a finitely generated abelian group of rank $r_1 + r_2 - 1$.


EXAMPLES. 


(1) The theorem reduces when $n = 2$ to results known from Chapter I.
Specifically if $K = \mathbb{Q}(\sqrt{m})$, then $m > 0$ makes $r_1 = 2$ and $r_2 = 0$, while
$m < 0$ makes $r_1 = 0$ and $r_2 = 1$.


(2) For $K = \mathbb{Q}(\sqrt[3]{2})$, let $\omega = e^{2\pi i/3}$. The field mappings of $K$ into $\mathbb{C}$ carry $\sqrt[3]{2}$
into $\mathbb{R}$ or $\mathbb{R}\omega$ or $\mathbb{R}\bar{\omega}$. Thus $r_1 = 1$ and $r_2 = 1$.


(3) The polynomial $F(X) = X^5 - 5X + 1$ in $\mathbb{Q}[X]$ was studied as an example in
connection with Galois theory in Section IX.11 of Basic Algebra. The polynomial
was shown to be irreducible over $\mathbb{Q}$ and to have three real roots and one pair of
complex conjugate roots. For $K = \mathbb{Q}[X]/(X^5 - 5X + 1)$, we therefore have
$r_1 = 3$ and $r_2 = 1$. The primitive element $\xi$ of $K$ with $\xi^5 - 5\xi + 1 = 0$ lies in
$R$; it is a nontrivial example of a member of $R^\times$ because $\xi(\xi^4 - 5) = -1$.
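
As a numerical sanity check on Example 3, the following Python sketch counts the real roots of $X^5 - 5X + 1$ by bisection and confirms the unit identity $\xi(\xi^4 - 5) = -1$ at each real root. The names `f`, `bisect_root`, and `brackets` are illustrative choices made here, not notation from the text, and the brackets rest on the elementary observation that $F'(X) = 5(X^4 - 1)$ vanishes only at $X = \pm 1$.

```python
def f(x):
    """The polynomial F(X) = X^5 - 5X + 1 from Example 3."""
    return x**5 - 5 * x + 1

# F'(X) = 5(X^4 - 1) vanishes only at X = -1 and X = 1, so F is monotone on
# (-inf, -1), (-1, 1), and (1, inf).  Since F(-2) < 0 and F(2) > 0, every real
# root lies in one of the three brackets below, at most one root per bracket.
brackets = [(-2.0, -1.0), (-1.0, 1.0), (1.0, 2.0)]

def bisect_root(lo, hi, steps=80):
    """Locate the root of f in [lo, hi], assuming a sign change at the ends."""
    assert f(lo) * f(hi) < 0
    for _ in range(steps):
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

real_roots = [bisect_root(lo, hi) for lo, hi in brackets if f(lo) * f(hi) < 0]
r1 = len(real_roots)          # number of real embeddings
r2 = (5 - r1) // 2            # number of conjugate pairs, from r1 + 2*r2 = 5
unit_rank = r1 + r2 - 1       # rank predicted by Theorem 5.13
```

Under these assumptions the sketch reports `r1 = 3`, `r2 = 1`, and a predicted unit rank of 3, in agreement with the example.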


The proof of Theorem 5.13 will occupy the remainder of this section. We 
begin by clarifying in Lemma 5.14 the relationship between discrete subgroups 
and lattices in Euclidean space and by proving in Proposition 5.15 a weak version 
of Theorem 5.13 that addresses everything except the existence questions. 


Lemma 5.14. A discrete subgroup of $\mathbb{R}^l$ is a free abelian group of rank $\le l$
and is necessarily of the form $\mathbb{Z}u_1 \oplus \cdots \oplus \mathbb{Z}u_m$ for some set $\{u_1, \dots, u_m\}$ that is
linearly independent over $\mathbb{R}$. The discrete subgroup is a lattice if and only if the
rank is $l$.


¹⁴ A discrete subset of $\mathbb{R}^l$ is a subset $S$ such that every one-point subset of $S$ is open when $S$ is
given the relative topology. See Lemma 5.14 below for a converse assertion.

PROOF. We begin by proving that any discrete subgroup of $\mathbb{R}^l$ is topologically
closed. Let $G$ be the subgroup, and choose by discreteness an open ball
$V = \{x \in \mathbb{R}^l \mid |x| < \epsilon\}$ about 0 with $V \cap G = \{0\}$. The open ball
$U = \{x \in \mathbb{R}^l \mid |x| < \epsilon/2\}$ has the property that $U + U \subseteq V$. If $G$ is not closed, let
$x_0$ be a limit point of $G$ that is not in $G$. Then the open ball $x_0 - U$ about $x_0$
must contain a member $g$ of $G$, and $g$ cannot equal $x_0$. Write $x_0 - u = g$ with
$u \in U$. Then $u = x_0 - g$ is a limit point of $G$ that is not in $G$, and we can find
$g' \ne 0$ in $G$ such that $g'$ is in $u + U$. But $u + U \subseteq U + U \subseteq V$, and so $g'$ is in
$G \cap V = \{0\}$, contradiction. We conclude that $G$ contains all its limit points and
is therefore closed.

From the fact that any discrete subgroup $G$ of $\mathbb{R}^l$ is closed, let us see that any
bounded subset of $G$ is finite. It is enough to see that the intersection $X$ of $G$ with
any (finite-radius) closed ball is finite. The set $X$ is closed because $G$ is closed,
and it is therefore compact by the Heine-Borel Theorem. By discreteness, find
for each $x \in G$ an open ball $U_x$ centered at $x$ that contains no member of $G$ other
than $x$. These open sets form an open cover of the compact set $X$, and a finite
subcollection of them covers $X$. Each such open set contains at most one member
of $X$, and hence $X$ is finite.

Returning to the statement of the lemma, we induct on the dimension of the
$\mathbb{R}$ linear span of the discrete subgroup, the base case being that the $\mathbb{R}$ linear span
is 0. Let $G$ be the discrete subgroup, and let $\{v_1, \dots, v_m\}$ in $G$ be a maximal set
that is linearly independent over $\mathbb{R}$. Let $G_0 = G \cap \big(\sum_{j=1}^{m-1} \mathbb{R}v_j\big)$. By induction
we may assume that every $u \in G_0$ is a $\mathbb{Z}$ linear combination of $v_1, \dots, v_{m-1}$. Let
$S$ be the set of $\mathbb{R}$ linear combinations of $\{v_1, \dots, v_m\}$ of the form

$$S = \Big\{ v = c_1v_1 + \cdots + c_mv_m \in G \;\Big|\; 0 \le c_i < 1 \text{ for } 1 \le i \le m-1,\ \ 0 \le c_m \le 1 \Big\}.$$


The set $S$ is bounded, and we saw in the previous paragraph that any bounded
subset of $G$ is finite. So $S$ is finite. Let $v'$ be a member of $S$ with the smallest
positive coefficient for $v_m$, say

$$v' = a_1v_1 + \cdots + a_mv_m.$$

If $v$ is any member of $S$ and its coefficient $c_m$ is not a multiple of $a_m$, then $v - jv'$
for a suitable integer $j$ has $m$-th coefficient positive but less than $a_m$; by subtracting
from $v - jv'$ a suitable $\mathbb{Z}$ linear combination $v''$ of $v_1, \dots, v_{m-1}$, we can make
$v - jv' - v''$ be in $S$, and then we have a contradiction to the minimality of
$a_m$. We conclude that $c_m$ is always a multiple of $a_m$. Then $v - jv'$ is in $G_0$ for
some integer $j$, and it follows that the $\mathbb{Z}$ linear combinations of $v_1, \dots, v_{m-1}, v'$
span $G$. This completes the induction and the proof of the first conclusion of the
lemma. The second conclusion is an immediate consequence of the first.

For the remainder of the section, we adopt the notation in the statement of
Theorem 5.13, and we shall not repeat it in the statement of every intermediate
result.


Proposition 5.15 (weak form of Dirichlet Unit Theorem). The kernel of the
restriction to $R^\times$ of Log is the finite subgroup of roots of unity in $K^\times$, and the
image of this restriction of Log is a discrete additive subgroup in the vector
subspace of elements $(x_1, \dots, x_{r_1+r_2})$ in $\mathbb{R}^{r_1+r_2}$ satisfying

$$x_1 + \cdots + x_{r_1} + 2x_{r_1+1} + \cdots + 2x_{r_1+r_2} = 0.$$

Consequently $R^\times$ is a finitely generated abelian group of rank $\le r_1 + r_2 - 1$.


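PROOF. For $\varepsilon$ in $R^\times$, we calculate that

$$\begin{aligned}
\log\|\varepsilon\|_1 + \cdots &+ \log\|\varepsilon\|_{r_1} + 2\log\|\varepsilon\|_{r_1+1} + \cdots + 2\log\|\varepsilon\|_{r_1+r_2} \\
&= \log\big(|\sigma_1(\varepsilon)| \cdots |\sigma_{r_1}(\varepsilon)|\,|\sigma_{r_1+1}(\varepsilon)|^2 \cdots |\sigma_{r_1+r_2}(\varepsilon)|^2\big) \\
&= \log\Big|\prod_{j=1}^{n} \sigma_j(\varepsilon)\Big| \\
&= \log|N_{K/\mathbb{Q}}(\varepsilon)| = \log 1 = 0.
\end{aligned}$$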


Hence the image lies in the vector subspace in the statement of the proposition. 
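
The calculation above can be checked numerically in a small case. The sketch below takes $K = \mathbb{Q}(\sqrt[3]{2})$, for which $r_1 = r_2 = 1$, and the element $\varepsilon = \sqrt[3]{2} - 1$, which is a unit because $(\sqrt[3]{2} - 1)(\sqrt[3]{4} + \sqrt[3]{2} + 1) = 1$. The helper names `embeddings` and `log_map` are hypothetical and are introduced only for this illustration.

```python
import cmath
import math

CBRT2 = 2 ** (1 / 3)                      # the real cube root of 2
OMEGA = cmath.exp(2j * math.pi / 3)       # primitive cube root of unity

def embeddings(a, b, c):
    """Images of a + b*2^(1/3) + c*4^(1/3) under sigma_1 (real) and sigma_2 (nonreal)."""
    return [a + b * theta + c * theta**2 for theta in (CBRT2, CBRT2 * OMEGA)]

def log_map(a, b, c):
    """Log of a nonzero element: (log|sigma_1|, log|sigma_2|)."""
    s1, s2 = embeddings(a, b, c)
    return math.log(abs(s1)), math.log(abs(s2))

# epsilon = 2^(1/3) - 1 has coordinates (a, b, c) = (-1, 1, 0).
x1, x2 = log_map(-1, 1, 0)

# The real coordinate enters with weight 1 and the complex one with weight 2.
weighted_sum = x1 + 2 * x2
```

Up to rounding error, `weighted_sum` vanishes, so $\operatorname{Log}(\varepsilon)$ lies in the hyperplane of the proposition; here `x1` is negative and `x2` is positive.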

Fix a (large) positive number $M$, and consider the set $E_M$ of all members $\varepsilon$
of $R^\times$ for which all coordinates of $\operatorname{Log}(\varepsilon)$ are $\le M$ in absolute value. Then the
field polynomials

$$\det\big(XI - (\text{left by } \varepsilon)\big) = \prod_{j=1}^{n} (X - \sigma_j(\varepsilon))$$

of such elements $\varepsilon$ have all coefficients bounded by some $M'$ depending on $M$,
since each $|\sigma_j(\varepsilon)|$ is of the form $\|\varepsilon\|_s$ and is $\le e^M$. Such a field polynomial is
equal to $g(X)^r$, where $g(X)$ is the minimal polynomial of $\varepsilon$ and $r$ is given by
$r \deg(g(X)) = n$. Since $\varepsilon$ is in $R$, the coefficients of $g(X)$ are integers, and
hence so are the coefficients of the corresponding field polynomial. There are
only finitely many members of $\mathbb{Z}[X]$ of degree $n$ whose coefficients are in a given
bounded set, and hence there are only finitely many $\varepsilon$'s in $E_M$.

It follows that the image subgroup is discrete. Taking $M = 0$, we see also that
the kernel of the restriction of Log to $R^\times$ is finite. Hence every element of this
kernel has finite order and is therefore a root of unity.

We come to the proof of Theorem 5.13. For quadratic extensions of $\mathbb{Q}$, which
were handled in Section I.6, the crucial question of existence was addressed by
means of an approximation result (Lemma 1.15) for irrational numbers. That
result did not immediately establish the existence of units of infinite order, but it 
was applied infinitely many times in the course of proving Proposition 1.16, and 
the total effect was to produce a unit of infinite order. 

We do something similar in general. In place of the approximation result 
in Lemma 1.15, we shall use a result known as the Minkowski Lattice-Point 
Theorem, which asserts the existence of lattice points in certain compact convex 
sets in Euclidean space. This result appears as Theorem 5.16 below. As was true 
in the quadratic case, it is not just a single application of this theorem that produces 
the desired units, but an infinite sequence of applications of it. The details will 
be more complicated here than in the quadratic case. Before describing how the 
argument is to proceed, let us establish the Minkowski theorem. 

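Let $\{v_1, \dots, v_m\}$ be an $\mathbb{R}$ basis of $\mathbb{R}^m$, and let $L = \mathbb{Z}v_1 \oplus \cdots \oplus \mathbb{Z}v_m$ be the
corresponding lattice. The fundamental parallelotope for $L$ corresponding to
this basis is the set

$$\{c_1v_1 + \cdots + c_mv_m \mid 0 \le c_j \le 1 \text{ for } 1 \le j \le m\}.$$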


The volume of this fundamental parallelotope is independent of the choice of the
$\mathbb{Z}$ basis for $L$. In fact, any two such $\mathbb{Z}$ bases are carried from one to the other by an
integer matrix of determinant $\pm 1$, and any linear transformation from $\mathbb{R}^m$ to itself
of determinant $\pm 1$ is volume preserving. The one fundamental parallelotope is
mapped to the other when the one basis is carried to the other, and hence the two
fundamental parallelotopes have the same volume.


Theorem 5.16 (Minkowski Lattice-Point Theorem).¹⁵ Let $L$ be a lattice in
$\mathbb{R}^m$, and let $V_0$ be the volume of a fundamental parallelotope. If $E$ is any compact
convex set in $\mathbb{R}^m$ containing 0, closed under negatives, and having $\operatorname{volume}(E) \ge 2^m V_0$,
then $E$ contains a nonzero point of $L$.


REMARK. The constant $2^m$ in the statement is best possible, as is shown by
taking $L$ to be the standard lattice and $E$ to be a cube oriented consistently with
$L$, centered at 0, and having each side slightly less than 2. We need merely some
constant, not the best possible one, in the application to Theorem 5.13, and the
proof can be simplified a little for that purpose.¹⁶ But the present theorem will be
applied again in the next section, and this time the best possible constant yields
the most useful information.


¹⁵ The simple proof given here is due to H. Blichfeldt and is the standard one, so standard that
Blichfeldt's name is sometimes attached to the theorem.

¹⁶ In particular, the final paragraph of the proof can be omitted, and we can fix a value of $M$
proportional to $s$ in making the argument.

PROOF. Without loss of generality, $L$ is the standard lattice of points with all
coordinates in $\mathbb{Z}$, and $V_0$ is 1. Fix an arbitrarily small positive constant $\epsilon$, and
first assume that the given set $E$ has $\operatorname{volume}(E) \ge (2 + \epsilon)^m V_0$. Arguing by
contradiction, suppose that the only lattice point in $E$ is 0. Since $E$ is bounded,
we can choose a number $s > 0$ in such a way that $E$ is contained in the cube
$C_s$ centered at 0, oriented consistently with the lattice, and having side $2s$. Let
us see that the sets $l + \frac{1}{2}E$ for $l \in L$ are disjoint. In fact, in obvious notation if
$l_1 + \frac{1}{2}e_1 = l_2 + \frac{1}{2}e_2$ with $l_1 \ne l_2$, then $l_1 - l_2 = \frac{1}{2}(e_2 - e_1)$, and this is in $E$
because $e_2$ and $-e_1$ are in $E$ and $E$ is convex. Thus the sets $l + \frac{1}{2}E$ are indeed
disjoint.

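Choose an integer $M$ large enough to have $s/M < \epsilon$. Any lattice point $l$ whose
coordinates are all $\le M$ in absolute value has $l + \frac{1}{2}E \subseteq C_{M+\frac{1}{2}s}$. Since the sets
$l + \frac{1}{2}E$ for these $l$'s are disjoint,

$$\big(2(M + \tfrac{1}{2}s)\big)^m = \operatorname{volume}(C_{M+\frac{1}{2}s}) \ge \sum_{\substack{l \in L \text{ with all} \\ \text{coordinates} \le M}} \operatorname{volume}(l + \tfrac{1}{2}E) \ge (2M)^m \operatorname{volume}(\tfrac{1}{2}E) = M^m \operatorname{volume}(E),$$

and therefore $\operatorname{volume}(E) \le (2 + s/M)^m$, in contradiction to our extra assumption
that $\operatorname{volume}(E) \ge (2 + \epsilon)^m$.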

Now suppose that $\operatorname{volume}(E) \ge 2^m$. For each $\epsilon > 0$, let $E_\epsilon$ be the dilate
$(1 + \frac{1}{2}\epsilon)E$. The sets $E_\epsilon$ satisfy the extra assumption made in the previous part of
the proof, and therefore $E_\epsilon$ contains a nonzero lattice point. Since $E_1$ is bounded,
there are only finitely many possibilities for this nonzero lattice point for each
$\epsilon \le 1$. Thus we can find a sequence of $\epsilon$'s tending to 0 for which this lattice point
is the same. The convexity of the sets $E_\epsilon$, in combination with the fact that the
sets contain 0, implies that the sets are nested, and therefore this lattice point lies
in $E_\epsilon$ for all $\epsilon > 0$. Since $E$ is compact, $E = \bigcap_{\epsilon > 0} E_\epsilon$, and therefore this lattice
point lies in $E$.
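
To see the theorem at work in the plane, consider for a real number $\alpha$ and an integer $Q \ge 1$ the parallelogram $E = \{(x, y) \mid |x - \alpha y| \le 1/Q,\ |y| \le Q\}$. It is compact, convex, closed under negatives, and has area $(2/Q)(2Q) = 4 = 2^2 V_0$ for the standard lattice $\mathbb{Z}^2$, so Theorem 5.16 promises a nonzero integer point $(p, q)$ in it. The following Python sketch simply searches for one; the function name `lattice_point_in_box` is a hypothetical helper, and the brute-force search is an illustration, not the method of the proof.

```python
import math

def lattice_point_in_box(alpha, Q):
    """Search Z^2 for a nonzero (p, q) with |p - q*alpha| <= 1/Q and |q| <= Q.

    Because the set is closed under negatives, it is enough to try q > 0,
    and for each q the best candidate p is the integer nearest to q*alpha.
    """
    for q in range(1, Q + 1):
        p = round(q * alpha)
        if abs(p - q * alpha) <= 1.0 / Q:
            return p, q
    return None

# Example: alpha = sqrt(2), Q = 10 yields the approximation 7/5 to sqrt(2).
example = lattice_point_in_box(math.sqrt(2), 10)
```

This is the classical approximation statement for irrational numbers, recovered here as a two-dimensional instance of the lattice-point theorem.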


Let us describe the lattice to be used when the Minkowski Lattice-Point Theorem
is applied to obtain the Dirichlet Unit Theorem. Let $\Omega$ be the real vector
space $\Omega = \mathbb{R}^{r_1} \times \mathbb{C}^{r_2} \cong \mathbb{R}^n$, and let $|\omega|_s$ be the magnitude of the $s$-th component
of $\omega \in \Omega$ for $1 \le s \le r_1 + r_2$. We introduce a homomorphism $\Phi$ of the additive
group of $K$ into the additive group of $\Omega$ given by

$$\Phi(x) = (\sigma_1(x), \dots, \sigma_{r_1}(x), \sigma_{r_1+1}(x), \dots, \sigma_{r_1+r_2}(x))$$

for $x \in K$. We shall be mostly interested in the restriction of $\Phi$ to $R$, but the
values on $K$ will help a little with motivation when the Minkowski Lattice-Point
Theorem is applied once again in the next section. Observe that our definitions
make $\|x\|_s = |\sigma_s(x)| = |\Phi(x)|_s$ for $x \in K$ and $1 \le s \le r_1 + r_2$.

Lemma 5.17. The image $\Phi(R)$ is a lattice in $\Omega$.


PROOF. The homomorphism $\Phi$ is one-one on $R$ because $\sigma_1$, being a field map,
is one-one. Since $R$ is a free abelian group of rank $n$ and $\Phi$ is one-one, $\Phi(R)$ is
free abelian of rank $n$. Lemma 5.14 therefore shows that it is sufficient to show
that $\Phi(R)$ is discrete as an additive subgroup of $\Omega$. It is enough to show that a
bounded region of $\Omega$ contains only finitely many points of $\Phi(R)$.

The verification of this fact is similar to an argument in the proof of Proposition
5.15: A bound by some $M$ on all $|\sigma_j(\alpha)|$ for certain elements $\alpha \in R$ implies that
each field polynomial

$$\det\big(XI - (\text{left by } \alpha)\big) = \prod_{j=1}^{n} (X - \sigma_j(\alpha))$$

has all its coefficients bounded by some $M'$ depending on $M$. These coefficients
are integers when $\alpha$ is in $R$, and thus there are only finitely many such polynomials.
Each polynomial has at most $n$ distinct roots, and consequently only finitely many
$\alpha$'s satisfy such a bound.
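
For a concrete picture of Lemma 5.17, take $K = \mathbb{Q}(\sqrt{2})$, whose ring of integers is $\mathbb{Z}[\sqrt{2}]$, so that $\Phi(a + b\sqrt{2}) = (a + b\sqrt{2},\ a - b\sqrt{2})$ in $\mathbb{R}^2$. The Python sketch below lists the points of $\Phi(R)$ in a bounded box and checks that the images of the basis $\{1, \sqrt{2}\}$ are linearly independent. The names `phi` and `points_in_box` are illustrative assumptions, not notation from the text.

```python
import math

SQRT2 = math.sqrt(2.0)

def phi(a, b):
    """Image of a + b*sqrt(2) under the two real embeddings of Q(sqrt 2)."""
    return (a + b * SQRT2, a - b * SQRT2)

def points_in_box(M):
    """All integer pairs (a, b) whose image under phi has both coordinates <= M in size.

    If both |a + b*sqrt2| and |a - b*sqrt2| are <= M, then |a| <= M and
    |b| <= M/sqrt2, so the finite search below is exhaustive.
    """
    bound_a = int(M)
    bound_b = int(M / SQRT2)
    return [(a, b)
            for a in range(-bound_a, bound_a + 1)
            for b in range(-bound_b, bound_b + 1)
            if all(abs(w) <= M for w in phi(a, b))]

# Determinant of the 2-by-2 matrix whose rows are phi(1) and phi(sqrt 2).
row1, row2 = phi(1, 0), phi(0, 1)
basis_det = row1[0] * row2[1] - row1[1] * row2[0]

count = len(points_in_box(3))
```

The box of radius 3 contains only finitely many image points (15 under these conventions), and `basis_det` equals $-2\sqrt{2} \ne 0$, consistent with $\Phi(R)$ being a lattice.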


We are now ready to prove Theorem 5.13, but we precede the proof by an 
outline. The proof has three steps to it: 


(1) We apply the Minkowski Lattice-Point Theorem to the set $\Phi(R) \subseteq \Omega$,
which we know is a lattice because of Lemma 5.17. For each $s_0$ with $1 \le s_0 \le r_1 + r_2$,
let $E_{s_0}$ be a set of $\omega$'s in $\Omega$ defined by the conditions that $|\omega|_s$ is to be
small for $s \ne s_0$ and $|\omega|_{s_0}$ is allowed to be large, with the understanding that
$E_{s_0}$ is a bounded set and that $E_{s_0}$ has volume $\ge 2^n V_0$, where $V_0$ is the volume
of a fundamental parallelotope of $\Phi(R)$. Using a nonzero lattice point in $\Phi(R)$
obtained from applying Theorem 5.16 to $E_{s_0}$ and squeezing $E_{s_0}$ even more, we
can obtain an infinite sequence of points $\alpha$ in $R$ such that $|N_{K/\mathbb{Q}}(\alpha)|$ remains
bounded and such that the size of this norm is contributed to mostly by $\|\alpha\|_{s_0}$.

(2) Applying the same argument that was used for quadratic extensions of $\mathbb{Q}$ in
the proof of Proposition 1.16, we obtain infinite sequences of units whose norm
is contributed to mostly by $\|\cdot\|_{s_0}$. We can do this for $1 \le s_0 \le r_1 + r_2$.

(3) We pass to the Log map, proving and applying the following result from linear
algebra: a real square matrix $[a_{ij}]$ with the property that $|a_{ii}| > \sum_{j \ne i} |a_{ij}|$ for
all $i$ is nonsingular. In the application of this result, we have $\log\|\varepsilon_{s_0}\|_{s_0} > 0$ for the
$s_0$-th constructed unit, $\log\|\varepsilon_{s_0}\|_s < 0$ for $s \ne s_0$, and an equality that we can write
either as $\sum_{j=1}^{n} \log|\sigma_j(\varepsilon_{s_0})| = 0$ or as $\sum_{s=1}^{r_1} \log\|\varepsilon_{s_0}\|_s + 2\sum_{s=r_1+1}^{r_1+r_2} \log\|\varepsilon_{s_0}\|_s = 0$.
If we drop all terms corresponding to the $(r_1 + r_2)$-th unit, then we are in a situation
for which the result from linear algebra immediately implies the theorem.

PROOF OF THEOREM 5.13. The proof is carried out in three steps.

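Step 1. For fixed $s_0$ with $1 \le s_0 \le r_1 + r_2$, we construct an infinite sequence
$\alpha_j^{(s_0)}$ in $R$ with

(i) $|N_{K/\mathbb{Q}}(\alpha_j^{(s_0)})| \le 2^n V_0$,

(ii) $\|\alpha_j^{(s_0)}\|_s$ tends to 0 for each $s \ne s_0$ as $j$ tends to infinity,

(iii) $\|\alpha_j^{(s_0)}\|_{s_0}$ tends to infinity as $j$ tends to infinity.

For the construction, form for each $j > 0$ the compact convex set in $\Omega$ closed
under multiplication by $-1$ consisting of all $\omega$ such that

$$|\omega|_s \le j^{-1} \quad \text{for } s \ne s_0,$$

$$|\omega|_{s_0} \le \begin{cases} 2^{n-r_1}\pi^{-r_2} j^{n-1} V_0 & \text{if } 1 \le s_0 \le r_1, \\ \big(2^{n-r_1}\pi^{-r_2} j^{n-2} V_0\big)^{1/2} & \text{if } r_1 + 1 \le s_0 \le r_1 + r_2. \end{cases}$$

This set has volume

$$\begin{cases} (2j^{-1})^{r_1-1}(\pi j^{-2})^{r_2}\big(2 \cdot 2^{n-r_1}\pi^{-r_2} j^{n-1} V_0\big) = 2^n V_0 & \text{if } s_0 \le r_1, \\ (2j^{-1})^{r_1}(\pi j^{-2})^{r_2-1}\big(\pi \cdot 2^{n-r_1}\pi^{-r_2} j^{n-2} V_0\big) = 2^n V_0 & \text{if } s_0 > r_1. \end{cases}$$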


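Theorem 5.16 shows that the set contains a nonzero lattice point $\Phi(\alpha_j^{(s_0)})$. Let us
check that $\alpha_j^{(s_0)}$ satisfies (i), (ii), and (iii). For (i), we have

$$\begin{aligned}
|N_{K/\mathbb{Q}}(\alpha_j^{(s_0)})| &= \Big(\prod_{s=1}^{r_1} \|\alpha_j^{(s_0)}\|_s\Big)\Big(\prod_{s=r_1+1}^{r_1+r_2} \|\alpha_j^{(s_0)}\|_s^2\Big) \\
&\le \begin{cases} (j^{-1})^{r_1-1}\big(2^{n-r_1}\pi^{-r_2} j^{n-1} V_0\big)(j^{-2})^{r_2} & \text{if } s_0 \le r_1, \\ (j^{-1})^{r_1}(j^{-2})^{r_2-1}\big(2^{n-r_1}\pi^{-r_2} j^{n-2} V_0\big) & \text{if } s_0 > r_1 \end{cases} \\
&= 2^n V_0\, 2^{-r_1}\pi^{-r_2} \\
&\le 2^n V_0.
\end{aligned}$$

Property (ii) is immediate from the inequality $\|\alpha_j^{(s_0)}\|_s \le j^{-1}$ for $s \ne s_0$. For
(iii), we have

$$1 \le |N_{K/\mathbb{Q}}(\alpha_j^{(s_0)})| = \Big(\prod_{s=1}^{r_1} \|\alpha_j^{(s_0)}\|_s\Big)\Big(\prod_{s=r_1+1}^{r_1+r_2} \|\alpha_j^{(s_0)}\|_s^2\Big);$$

thus (ii) implies (iii).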
Step 2. For fixed $s_0$ with $1 \le s_0 \le r_1 + r_2$, we construct an infinite sequence
of units $\varepsilon_j^{(s_0)}$ such that

(ii') $\|\varepsilon_j^{(s_0)}\|_s$ tends to 0 for each $s \ne s_0$ as $j$ tends to infinity,

(iii') $\|\varepsilon_j^{(s_0)}\|_{s_0}$ tends to infinity as $j$ tends to infinity.

For the construction, we pass to a subsequence from Step 1, still denoting it by
$\alpha_j^{(s_0)}$, such that $N_{K/\mathbb{Q}}(\alpha_j^{(s_0)})$ is a constant integer, say $M$. Since $R/(M)$ is finite,
we can pass to a further subsequence, still with no change in notation, such that
all $\alpha_j^{(s_0)}$ lie in the same residue class¹⁷ modulo the principal ideal $(M)$ of $R$. Put

$$\varepsilon_j^{(s_0)} = \alpha_j^{(s_0)} / \alpha_1^{(s_0)}.$$

Then $N_{K/\mathbb{Q}}(\alpha_j^{(s_0)}) = N_{K/\mathbb{Q}}(\alpha_1^{(s_0)})$, since $N_{K/\mathbb{Q}}(\alpha_j^{(s_0)})$ is a constant integer, and
$M^{-1}(\alpha_j^{(s_0)} - \alpha_1^{(s_0)})$ is in $R$, since all $\alpha_j^{(s_0)}$ lie in the same residue class modulo $(M)$.
The computation

$$\varepsilon_j^{(s_0)} = 1 + \frac{\alpha_j^{(s_0)} - \alpha_1^{(s_0)}}{\alpha_1^{(s_0)}} = 1 + \frac{\alpha_j^{(s_0)} - \alpha_1^{(s_0)}}{M} \prod_{\sigma \ne 1} \sigma(\alpha_1^{(s_0)})$$

shows that $\varepsilon_j^{(s_0)}$ is an algebraic integer. Hence it is in $R$. We certainly have

$$N_{K/\mathbb{Q}}(\varepsilon_j^{(s_0)}) = \frac{N_{K/\mathbb{Q}}(\alpha_j^{(s_0)})}{N_{K/\mathbb{Q}}(\alpha_1^{(s_0)})} = \frac{M}{M} = 1.$$

Therefore $\varepsilon_j^{(s_0)}$ is a unit. Also, the computation

$$\|\varepsilon_j^{(s_0)}\|_s = \frac{\|\alpha_j^{(s_0)}\|_s}{\|\alpha_1^{(s_0)}\|_s}$$

shows that (ii) and (iii) in Step 1 imply (ii') and (iii') here.
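Step 3. For each $s_0$ with $1 \le s_0 \le r_1 + r_2$, choose $j$ large enough for the unit
$\varepsilon_j^{(s_0)} = \varepsilon_{s_0}$ in Step 2 to satisfy

(ii'') $\|\varepsilon_{s_0}\|_s < 1$ if $s \ne s_0$,

(iii'') $\|\varepsilon_{s_0}\|_{s_0} > 1$.

We assert that the vectors $\operatorname{Log}(\varepsilon_{s_0})$ for $1 \le s_0 \le r_1 + r_2 - 1$ are linearly
independent over $\mathbb{R}$. Hence $\operatorname{Log}(R^\times)$ has rank $\ge r_1 + r_2 - 1$, and Proposition
5.15 therefore implies that $\operatorname{Log}(R^\times)$ has rank equal to $r_1 + r_2 - 1$.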

To verify this assertion, form the square matrix $[a_{ij}]$ of size $r_1 + r_2$ given by

$$a_{ij} = \begin{cases} \log\|\varepsilon_i\|_j & \text{if } 1 \le j \le r_1, \\ 2\log\|\varepsilon_i\|_j & \text{if } r_1 + 1 \le j \le r_1 + r_2. \end{cases}$$

¹⁷ This conclusion uses a result known as the Dirichlet pigeonhole principle or the Dirichlet
box principle.

Then $a_{ii} > 0$ for each $i$ by (iii''), $a_{ij} < 0$ for $i \ne j$ by (ii''), and $\sum_j a_{ij} = 0$ for
each $i$ because $N_{K/\mathbb{Q}}(\varepsilon_i) = 1$. Let $[b_{ij}]$ be the upper left block of $[a_{ij}]$ of size
$r_1 + r_2 - 1$. For each $i$, we then have $b_{ii} > 0$ and $\sum_{j \text{ with } j \ne i} |b_{ij}| < b_{ii}$. Let
us prove that the matrix $[b_{ij}]$ is nonsingular. Assuming the contrary, let $[c_j]$ be a
nonzero column vector with

$$\sum_j b_{ij}c_j = 0 \quad \text{for all } i. \tag{$*$}$$

If $i_0$ is an index such that $|c_{i_0}| \ge |c_j|$ for all $j$, then setting $i = i_0$ leads to the
strict inequality

$$|b_{i_0 i_0}c_{i_0}| = b_{i_0 i_0}|c_{i_0}| > \sum_{j \ne i_0} |b_{i_0 j}|\,|c_{i_0}| \ge \sum_{j \ne i_0} |b_{i_0 j}|\,|c_j| \ge \Big|\sum_{j \ne i_0} b_{i_0 j}c_j\Big|,$$

which contradicts $(*)$. Thus $[b_{ij}]$ is nonsingular.
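
The linear-algebra step can be illustrated on a small matrix of the same shape as $[a_{ij}]$: positive diagonal, negative off-diagonal entries, and zero row sums. The sketch below, written in Python with exact rational arithmetic, checks that the full matrix is singular while its upper left block is strictly diagonally dominant and has nonzero determinant. The sample entries and the helper names `det` and `is_strictly_diagonally_dominant` are made up for this illustration.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def is_strictly_diagonally_dominant(m):
    """True if |m[i][i]| exceeds the sum of |m[i][j]| over j != i, for every row i."""
    return all(abs(m[i][i]) > sum(abs(m[i][j]) for j in range(len(m)) if j != i)
               for i in range(len(m)))

# Sample matrix: positive diagonal, negative off-diagonal, every row sums to 0.
a = [[ 5, -2, -3],
     [-1,  4, -3],
     [-2, -2,  4]]

# Upper left block obtained by dropping the last row and the last column.
b = [row[:2] for row in a[:2]]
```

Here `det(a)` is 0 because the rows sum to zero, whereas `b` is strictly diagonally dominant with `det(b)` equal to 18, matching the pattern used in Step 3.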

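We conclude that $[b_{ij}]$ has rank $r_1 + r_2 - 1$. Thus its rows are linearly
independent, and the first $r_1 + r_2 - 1$ rows of $[a_{ij}]$ must be linearly independent.
Therefore the vectors

$$\big(\log\|\varepsilon_{s_0}\|_1, \dots, \log\|\varepsilon_{s_0}\|_{r_1},\ 2\log\|\varepsilon_{s_0}\|_{r_1+1}, \dots, 2\log\|\varepsilon_{s_0}\|_{r_1+r_2}\big),$$

indexed by $s_0$ for $1 \le s_0 \le r_1 + r_2 - 1$, are linearly independent in $\mathbb{R}^{r_1+r_2}$. In other
words, the vectors $\operatorname{Log}(\varepsilon_{s_0})$ are linearly independent for $1 \le s_0 \le r_1 + r_2 - 1$.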


6. Finiteness of the Class Number 


As in Section 5, let $K$ be a number field of degree $n$ over $\mathbb{Q}$, and let $R$ be its ring
of algebraic integers. Let $\sigma_1, \dots, \sigma_n$ be the distinct field maps of $K$ into $\mathbb{C}$, and
assume that the first $r_1$ of them have image in $\mathbb{R}$ and the remaining ones come in
conjugate pairs with $\sigma_{r_1+r_2+k} = \overline{\sigma_{r_1+k}}$ for $1 \le k \le r_2$.

As in Section I.7, where we treated the case of quadratic extensions, we define
two nonzero ideals $I$ and $J$ of $R$ to be equivalent if $(r)I = (s)J$ for suitable
nonzero elements $r$ and $s$ of $R$. The same argument as given in that section
shows that the result is an equivalence relation. The principal ideals form a single
equivalence class.¹⁸


¹⁸ Section I.7 worked also with a notion of strict equivalence of ideals, but we shall not attempt
to extend strict equivalence to the present setting.

Proposition 5.18. Multiplication of nonzero ideals in $R$ descends to a well-defined
multiplication of equivalence classes of ideals, and the resulting multiplication
makes the set of equivalence classes into an abelian group. The identity
element of this group is the class of principal ideals.


REMARKS. The proofs of this result and of Theorem 5.19 below will use the
following fact proved in Problems 48-53 of Chapter VIII of Basic Algebra: if $I$
is any nonzero ideal in $R$ and if $I^{-1}$ is defined by $I^{-1} = \{x \in K \mid xI \subseteq R\}$, then
$I^{-1}I = R$ and there exists a nonzero $r \in R$ with $rI^{-1}$ equal to an ideal of $R$. This fact can
be made to look more beautiful by introducing the notion of "fractional ideal,"
but we shall not carry out that step at this time.¹⁹


PROOF. If $I$ is a nonzero ideal, let $[I]$ denote its equivalence class, and define
$[I][J] = [IJ]$. Suppose that $(r)I = (s)I'$ exhibits an equivalence. Then the
equality $(s)I'J = (r)IJ$ shows that $[I'J] = [IJ]$. A similar argument applies
in the $J$ variable, and therefore multiplication of classes is well defined. It is
immediate that multiplication of classes is associative and commutative and also
that the class of principal ideals is an identity. If a class $[I]$ is given, let $I^{-1}$ be
as in the remarks above, and choose a nonzero $r \in R$ such that $rI^{-1} = J$ is an
ideal in $R$. Multiplying by $I$ gives $(r) = r(I^{-1}I) = (rI^{-1})I = JI$, and thus
$[J][I]$ is the class of the principal ideals. So $[I]$ has an inverse.


The group of equivalence classes of nonzero ideals as in Proposition 5.18 is
called the ideal class group of $K$. Its order is called the class number of $K$ and
will be denoted by $h_K$. The main theorem of this section is as follows.


Theorem 5.19. The class number $h_K$ of any number field is finite.


As we shall see in a moment, it is not too difficult at this stage to prove this
finiteness. However, $h_K$ is an important invariant of a number field that determines
whether $R$ is a principal ideal domain, that occurs in various limit formulas in
the subject, and that occurs also in dimension formulas connected with "Hilbert
class fields." It is therefore of considerable interest to be able to compute $h_K$ in
specific examples. For quadratic fields this computation can be carried out by
the techniques of Chapter I because of the close connection between ideal classes
and proper equivalence classes of binary quadratic forms. But no comparable
theory is available as an aid in computation for number fields of degree greater
than 2. As we shall see, the relatively easy proof of Theorem 5.19 that we give
in a moment does not offer any helpful clues about the value of $h_K$. The main
task of this section will therefore be to provide a better proof of Theorem 5.19
that helps us find the value of $h_K$ in specific examples.

¹⁹ The result of the beautification is that the fractional ideals form a group generated by the ideals,
and the group of equivalence classes is a homomorphic image of the group of fractional ideals.

The two proofs have the following lemma in common. The lemma eliminates 
the notion of equivalence of ideals from the investigation and shows that the 
problem is really that of finding elements in each ideal of relatively small norm. 


Lemma 5.20. For a particular number field $K$, if there exists a real constant $C$
with the property that each nonzero ideal $I$ of $R$ contains an element $s \ne 0$ with

$$|N_{K/\mathbb{Q}}(s)| \le C\,N(I),$$

then each equivalence class of ideals contains a member $L$ whose absolute norm
satisfies $N(L) \le C$. Consequently the class number $h_K$ is at most the number of
nonzero ideals $I$ in $R$ with $N(I) \le C$. This is a finite number.


PROOF. Let a nonzero ideal $I$ in $R$ be given. By the remarks with Proposition
5.18, choose a nonzero element $r$ in $R$ and an ideal $J$ such that $rI^{-1} = J$.
Multiplication by $I$ and use of the remarks shows that $(r) = JI$. By hypothesis
for the lemma, choose a nonzero $s \in J$ with $|N_{K/\mathbb{Q}}(s)| \le C\,N(J)$. Since $s$ is in
$J$, $(s)$ is contained in $J$, and therefore $(s) = JL$ for some ideal $L$. Multiplying
both sides of $(r) = JI$ by $L$ gives $(r)L = LJI = (s)I$, and $L$ is therefore
equivalent to $I$. Applying Proposition 5.4, we obtain $N(J)N(L) = N(JL) =
N((s)) = |N_{K/\mathbb{Q}}(s)| \le C\,N(J)$. Therefore $N(L) \le C$ as required.

Let us now count the ideals $I$ with $N(I) \le C$. In terms of the unique
factorization $I = \prod_{j=1}^{l} P_j^{k_j}$ of $I$, we have $N(I) \ge \prod_{j=1}^{l} p_j^{k_j}$, where $p_j$ is the
prime number such that $P_j \cap \mathbb{Z} = (p_j)$. In each case, $N(P_j) \ge p_j$. There are
only finitely many primes $p$ with $p \le C$, each is associated with only finitely
many prime ideals $P$ of $R$ with $P \cap \mathbb{Z} = (p)$, and $P^k$ contributes at least $2^k$
toward $N(I)$. The inequality $N(I) \le C$ shows that these $p$'s and their associated
$P$'s are the only possible contributors to $I$ and that each exponent is bounded by
$\log_2 N(I)$. Hence there are only finitely many possibilities for $I$.


Here is the relatively easy proof of Theorem 5.19. 


FIRST PROOF OF THEOREM 5.19. Let $x_1, \dots, x_n$ be a $\mathbb{Z}$ basis of $R$, and express
members of $R$ in terms of this basis as $r = \sum_{i=1}^{n} c_ix_i$ with all $c_i \in \mathbb{Z}$. The
value of $N_{K/\mathbb{Q}}(r)$ is the value of the determinant of left multiplication by $r$ on
$K$, and this value, as a function of $c_1, \dots, c_n$, is a homogeneous polynomial of
degree $n$. Consequently we can find a constant $C$ such that
$\big|N_{K/\mathbb{Q}}\big(\sum_{i=1}^{n} c_ix_i\big)\big| \le C \max_{1 \le i \le n} |c_i|^n$.

It is enough to show that the condition of Lemma 5.20 is satisfied for this $C$.
Thus let an ideal $J$ be given. As each $c_i$ runs through the integers from 0 to
$N(J)^{1/n}$, we obtain more than $N(J)$ members $r = \sum_{i=1}^{n} c_ix_i$ of $R$. Since there
are only $N(J)$ cosets modulo $J$, at least two of these members of $R$, say $r_1$ and
$r_2$, must lie in the same coset.²⁰ Then $r_1 - r_2$ is a nonzero member of $J$, it has all
coefficients between $-N(J)^{1/n}$ and $+N(J)^{1/n}$, and our construction of $C$ forces
$|N_{K/\mathbb{Q}}(r_1 - r_2)| \le C\,(N(J)^{1/n})^n = C\,N(J)$.


The second proof of Theorem 5.19 is to combine Lemma 5.20 with the deeper 
and more quantitative estimate given in the following theorem. 


Theorem 5.21 (Minkowski). For any number field $K$ of degree $n$, each nonzero
ideal $I$ of $R$ contains an element $s \ne 0$ with

$$|N_{K/\mathbb{Q}}(s)| \le \Big(\frac{4}{\pi}\Big)^{r_2} \frac{n!}{n^n}\, |D_K|^{1/2}\, N(I).$$

Here $r_2$ is half the number of nonreal embeddings of $K$ in $\mathbb{C}$, and $D_K$ is the field
discriminant. Therefore every equivalence class of ideals contains a member $L$
whose absolute norm satisfies

$$N(L) \le \Big(\frac{4}{\pi}\Big)^{r_2} \frac{n!}{n^n}\, |D_K|^{1/2}.$$
us n 


We shall prove Theorem 5.21 shortly by applying Minkowski's Lattice-Point
Theorem to the lattice $\Phi(I)$ in $\Omega = \mathbb{R}^{r_1} \times \mathbb{C}^{r_2}$, where $\Phi$ is the mapping described
after the proof of Theorem 5.16. The particular compact convex set in the
application takes some time to describe, and we return to that matter shortly.

Meanwhile, let us see a little of the utility of Theorem 5.21. The techniques of
Chapter I are more useful for computing class numbers for $n = 2$ than Theorem
5.21 is, and we therefore consider only $n \ge 3$. For $n = 3$, we must have
$r_2 \le 1$. Theorem 5.21 shows that every equivalence class of ideals in $R$ has a
representative $L$ with

$$N(L) \le \frac{4}{\pi} \cdot \frac{3!}{3^3}\, |D_K|^{1/2} = \frac{8}{9\pi}\, |D_K|^{1/2} < (0.283)\,|D_K|^{1/2}.$$

Problems 1-2 at the end of the chapter give examples of cubic extensions of $\mathbb{Q}$
whose discriminants are $-23$, $-31$, and $-44$. Since these have $(0.283)|D_K|^{1/2} <
(0.283) \cdot 7 < 2$, the representative ideal in each case must have norm 1 and must
be $R$. Thus for all three of these cubic fields, $R$ is a principal ideal domain.


²⁰ Again we are applying the Dirichlet pigeonhole principle.

For the cubic field $K = \mathbb{Q}(\sqrt[3]{2})$, we know from Section 2 that the discriminant
is $D_K = -108$. Consequently the estimate shows that every class of ideals has
a representative with norm $\le 2$. If an ideal $I$ has $N(I) = 2$, then 2 has to be a
member, and $I$ divides $(2)R$. Proposition 5.10d shows that the factorization of
$(2)R$ is as $P^3$ for a certain unique prime ideal $P$. Thus $R$ and $P$ represent all
equivalence classes, and $h_K$ is 1 or 2. If there is some $r \in R$ with $N_{K/\mathbb{Q}}(r) = 2$,
then $P = (r)$, and the class number is 1; otherwise it is 2. The element $\sqrt[3]{2}$ has
$|N_{K/\mathbb{Q}}(\sqrt[3]{2})| = 2$, and thus $P = (\sqrt[3]{2})$. Therefore $R$ is a principal ideal domain
when $K = \mathbb{Q}(\sqrt[3]{2})$.

For Dedekind's example, namely the cubic number field $K$ built from
$X^3 + X^2 - 2X + 8$, we saw in Section 4 that the discriminant is $D_K = -503$.
Then the constant in the estimate is $< (0.283)\sqrt{503} < 6.35$. So the interest is in
ideals of norm $\le 6$. In ruling out ideals that are principal, we need consider only
prime ideals with norm $\le 6$. Problems 24-32 at the end of the chapter identify
all the prime ideals of this form and show that they are all principal ideals! We
conclude that $h_K = 1$, i.e., that the $R$ in Dedekind's example is a principal ideal
domain. Not every cubic number field has class number 1, however; Problem 4
gives an example.
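
The numerical estimates in the preceding examples are easy to reproduce. The Python sketch below evaluates the constant of Theorem 5.21 for the cubic discriminants mentioned above, taking $r_2 = 1$, which gives the larger of the two constants allowed by $r_2 \le 1$. The function name `minkowski_bound` is a label chosen here for illustration.

```python
import math

def minkowski_bound(n, r2, disc):
    """The constant (4/pi)^r2 * n!/n^n * |D_K|^(1/2) of Theorem 5.21."""
    return (4 / math.pi) ** r2 * math.factorial(n) / n**n * math.sqrt(abs(disc))

# Cubic discriminants discussed in the text, all evaluated with n = 3, r2 = 1.
bounds = {d: minkowski_bound(3, 1, d) for d in (-23, -31, -44, -108, -503)}

# Largest integer norm that a representative ideal can have in each case.
max_norm = {d: math.floor(b) for d, b in bounds.items()}
```

Under these assumptions `max_norm` is 1 for discriminants $-23$, $-31$, $-44$, is 2 for $-108$, and is 6 for $-503$, matching the ranges of norms examined in the text.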

Before turning to the proof of Theorem 5.21, let us observe the following 
striking consequence. 


Corollary 5.22 (Minkowski). For any number field $K$ of degree $n$,

$$|D_K|^{1/2} \ge \Big(\frac{\pi}{4}\Big)^{r_2} \frac{n^n}{n!}.$$

Therefore $|D_K| > 1$ if $n \ge 2$, and there exists at least one prime number that
ramifies in $K$.


REMARKS. With a more general number field F than Q as base field, it can 
happen that no prime ideal ramifies in a certain nontrivial extension field K/F. 
See Problems 5-9 at the end of the chapter. 


PROOF. Set $I = R$ in Theorem 5.21, so that $N(I) = 1$. The nonzero element
$s$ must have $|N_{K/\mathbb{Q}}(s)| \ge 1$. The theorem says that $(4/\pi)^{r_2}(n!/n^n)|D_K|^{1/2} \ge 1$,
and this is the displayed inequality of the corollary. Since $r_2 \le \frac{1}{2}n$, $(\pi/4)^{r_2} \ge
(\pi/4)^{n/2}$, and thus $|D_K|^{1/2} \ge 2^{-n}\pi^{n/2}n^n/n!$. Denote the right side of this
inequality by $a_n$. For $n = 2$, we have $a_2 = \pi/2 > 1$. Also, $a_{n+1}/a_n =
\frac{1}{2}\pi^{1/2}(1 + \frac{1}{n})^n > \pi^{1/2} > 1$, since $(1 + \frac{1}{n})^n$ is monotone increasing²¹ with $n$ and is
$> 2$ for $n = 2$. Hence $a_n > 1$ for all $n \ge 2$. By Theorem 5.5 some prime number
ramifies in $K$.
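
The growth of the lower bound $a_n = 2^{-n}\pi^{n/2}n^n/n!$ used in this proof can be tabulated directly. The following Python sketch computes $a_n$ for a range of degrees and checks the two facts the argument relies on, namely $a_2 = \pi/2$ and $a_{n+1}/a_n > \pi^{1/2}$ for $n \ge 2$. The function name `a` mirrors the notation of the proof; the range of $n$ tested is an arbitrary choice.

```python
import math

def a(n):
    """The quantity a_n = (pi/4)^(n/2) * n^n / n!, a lower bound for |D_K|^(1/2)."""
    return (math.pi / 4) ** (n / 2) * n**n / math.factorial(n)

values = {n: a(n) for n in range(2, 41)}

# Successive ratios a_{n+1}/a_n = (1/2) * sqrt(pi) * (1 + 1/n)^n.
ratios = {n: values[n + 1] / values[n] for n in range(2, 40)}
```

In this range every `values[n]` exceeds 1 and every ratio exceeds $\pi^{1/2} \approx 1.77$, so the bound on $|D_K|^{1/2}$ grows at least geometrically with the degree.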


²¹ To see this monotonicity, expand $(1 + \frac{1}{n+1})^{n+1}$ and $(1 + \frac{1}{n})^n$ by the Binomial
Theorem, and observe that the asserted inequality holds term by term.

We turn to the proof of Theorem 5.21. We again make use of the map
$\Phi : K \to \Omega = \mathbb{R}^{r_1} \times \mathbb{C}^{r_2} \cong \mathbb{R}^n$ of the previous section. Lemma 5.17 shows that
$\Phi(R)$ is a lattice in $\Omega$, and our interest will be in the sublattice $\Phi(I)$, $I$ being the
nonzero ideal under study. The idea is to consider the set of $\omega \in \Omega$ for which the
function

$$N(\omega) = \Big(\prod_{i=1}^{r_1} |\omega|_i\Big)\Big(\prod_{i=r_1+1}^{r_1+r_2} |\omega|_i^2\Big)$$

has $N(\omega) \le c$, $c$ being a positive number. Since $N(\Phi(x)) = |N_{K/\mathbb{Q}}(x)|$ for
$x \in K$, the question of finding a member $s$ of $I$ with $|N_{K/\mathbb{Q}}(s)| \le c$ is the same
as the question of finding a nonzero lattice point in the set for which $N(\omega) \le c$.
Once we sort out how large $c$ has to be for the answer to be affirmative, then
the inequality of the theorem will result. The tool will again be the Minkowski
Lattice-Point Theorem (Theorem 5.16), but the difficulty is that the set for which
$N(\omega) \le c$ is not necessarily convex.

The nature of the set for which $N(\omega) \le c$ becomes clearer by considering the
case of $K = \mathbb{Q}(\sqrt{m})$ with $m > 0$. The map $\Phi$ carries $x + y\sqrt{m}$ for $x$ and $y$ in
$\mathbb{Q}$ to the pair $(x + y\sqrt{m},\ x - y\sqrt{m})$ in $\mathbb{R}^2$, and if we parametrize $\omega$ by the pair
$(x, y)$, then the set for which $N(\omega) \le c$ is the part of the $(x, y)$ plane containing
the origin and bounded by the two hyperbolas $x^2 - my^2 = c$ and $x^2 - my^2 = -c$.
This set is not convex, and it is not even bounded.

Briefly, an individual coordinate of our $\Omega = \mathbb{R}^{r_1} \times \mathbb{C}^{r_2}$, whether a factor of
type $\mathbb{R}$ or a factor of type $\mathbb{C}$, contributes something compact convex to the set
for which $N(\omega) \le c$ as long as the other coordinates are fixed, but as soon as
we allow more than one coordinate to vary, then the product formula defining
$N(\omega)$ produces sets that are neither convex nor bounded. To use Theorem 5.16,
we want to inscribe a compact convex set within the set for which $N(\omega) \le c$,
making the inscribed set contain the origin, be closed under negatives, and have
volume as large as possible.

If we were trying to inscribe such a compact convex set in a region cut out by
two hyperbolas as above, then the best possible set to use would be a rectangle
with sides parallel to the axes. However, the description above in terms of those
two hyperbolas used a noncanonical parametrization of elements of $\mathbb{Q}(\sqrt{m})$ as
all rational combinations $x + y\sqrt{m}$.

Let us proceed for the general case by using only the structure that is given to
us, without using any noncanonical parametrization. The things that are canonical
are the factors $\mathbb{R}$ and $\mathbb{C}$, the functions $|\cdot|_i$ defined on them, and functions of these.
For the example above, the function $N(\omega)$ is given by $N(\omega) = |\omega|_1|\omega|_2$. The
geometric set in $\mathbb{R}^2 = \{(\omega_1, \omega_2)\}$ to consider is changed from above; it is still the
set toward the origin from two hyperbolas, but the hyperbolas are changed to be
$\omega_1\omega_2 = \pm c$, having the axes as asymptotes. The inscribed convex set becomes the
set with $|\omega_1| + |\omega_2| \le 2c^{1/2}$. The containment of the latter set in the set toward
the origin from the two hyperbolas follows from the inequality $|\omega_1\omega_2|^{1/2} \le
\frac{1}{2}(|\omega_1| + |\omega_2|)$, which is a consequence of the inequality $\frac{1}{4}(|\omega_1| - |\omega_2|)^2 \ge 0$.
In the general case the inscribed convex set is described in terms of the function

$$T(\omega) = \sum_{i=1}^{r_1} |\omega|_i + 2\sum_{i=r_1+1}^{r_1+r_2} |\omega|_i.$$

The set of $\omega$ with $T(\omega) \le t$, $t$ being a positive constant, is evidently a compact
convex set containing 0 and closed under negatives, and the functions $T(\omega)$ and
$N(\omega)$ are connected by the arithmetic-geometric mean inequality, which says
that

\[
N(\omega)^{1/n} \le \frac{1}{n}\, T(\omega).
\]

Because of this inequality the set with $T(\omega) \le t$ is contained in the set with
$N(\omega) \le t^n/n^n$.
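The inequality can be checked numerically. The following is only a sketch: it assumes that $N(\omega)$ is the product of the absolute values on the real factors and of the squared absolute values on the complex factors, which is what the inequality with $n = r_1 + 2r_2$ requires. The function names `N`, `T`, and `amgm_holds` are illustrative and do not come from the text.

```python
def N(omega_real, omega_complex):
    """Product of |x_i| over real factors and |z_i|^2 over complex factors (assumed form of N)."""
    prod = 1.0
    for x in omega_real:
        prod *= abs(x)
    for z in omega_complex:
        prod *= abs(z) ** 2
    return prod


def T(omega_real, omega_complex):
    """T(omega) = sum |x_i| + 2 * sum |z_i|, as in the display above."""
    return sum(abs(x) for x in omega_real) + 2 * sum(abs(z) for z in omega_complex)


def amgm_holds(omega_real, omega_complex, tol=1e-12):
    """Check N(omega)^(1/n) <= T(omega)/n with n = r1 + 2*r2, up to rounding."""
    n = len(omega_real) + 2 * len(omega_complex)
    lhs = N(omega_real, omega_complex) ** (1.0 / n)
    return lhs <= T(omega_real, omega_complex) / n + tol


print(amgm_holds([3.0, -0.5], [2 - 1j, 0.25j]))
```

Equality occurs when all the absolute values agree, for instance for `amgm_holds([1.0, 1.0], [1 + 0j])`, where both sides equal 1.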

Since the absolute value in each $\mathbb{R}$ or $\mathbb{C}$ coordinate is canonical, so is the
notion of volume, given on rectangular sets by taking products; as usual the
understanding is that the set in a factor of $\mathbb{R}$ on which the absolute value is
$\le k$ contributes a factor of $2k$ to the volume, and the comparable set in a factor
of $\mathbb{C}$ contributes a factor of $\pi k^2$. If $V_0$ denotes the volume of a fundamental
parallelotope for the lattice $\Phi(I)$ in the $n$-dimensional Euclidean space $\Omega$, then
the Minkowski Lattice-Point Theorem says that the set with $T(\omega) \le t$, and
therefore also the set with $N(\omega) \le t^n/n^n$, contains a nonzero lattice point as
soon as the volume of the set with $T(\omega) \le t$ is $\ge 2^n V_0$. In other words, as soon
as the volume of the set with $T(\omega) \le t$ is $\ge 2^n V_0$, there exists an $s \ne 0$ in $I$ with
$|N_{K/\mathbb{Q}}(s)| \le t^n/n^n$.

To prove Theorem 5.21, we therefore need to know two things: the volume $V_0$
of a fundamental parallelotope for $\Phi(I)$ and the volume of the set with $T(\omega) \le t$.
Then we can find the smallest $t$ for which the set with $T(\omega) \le t$ has volume
$\ge 2^n V_0$, and we can sort out the details.

Let us compute the volume $V_0$. Let $\Gamma = (\alpha_1, \dots, \alpha_n)$ be an ordered $\mathbb{Z}$ basis
of the ideal $I$. The easy case in which to compute $V_0$ is that $r_1 = n$, i.e., that all
the field embeddings of $K$ into $\mathbb{C}$ are real. In this case the discriminant $D(\Gamma)$ is
the determinant of the $n$-by-$n$ matrix $[B_{ij}]$ with

\[
B_{ij} = \mathrm{Tr}_{K/\mathbb{Q}}(\alpha_i\alpha_j) = \sum_{k=1}^{n} \sigma_k(\alpha_i\alpha_j)
       = \sum_{k=1}^{n} \sigma_k(\alpha_i)\sigma_k(\alpha_j) = \sum_{k=1}^{n} A_{ik}A_{jk},
\]

where $[A_{ij}]$ is the matrix with $A_{ij} = \sigma_j(\alpha_i)$. We recognize $|\det[A_{ij}]|$ as the
volume of a fundamental parallelotope for $\Phi(I)$, and therefore $|D(\Gamma)| = V_0^2$.
By Proposition 5.1, $D(\Gamma) = N(I)^2 D_K$, and therefore $V_0 = N(I)|D_K|^{1/2}$.

This answer for the value of $V_0$ is not correct if some of the embeddings of $K$
into $\mathbb{C}$ are nonreal, since $|\det[\sigma_j(\alpha_i)]|$ no longer equals $V_0$. To see how to adjust
matters, suppose that $\sigma$ is a nonreal field mapping of $K$ into $\mathbb{C}$. Then the $n$-by-$n$
matrix $[\sigma_j(\alpha_i)]$ contains one column $z = (z_1, \dots, z_n)^{\mathsf T}$ corresponding to $\sigma$ and another
column $\bar z = (\bar z_1, \dots, \bar z_n)^{\mathsf T}$ corresponding to $\bar\sigma$. The entries in the $k$th row tell how $\alpha_k$ is
embedded in $\Omega$, namely at some point $z_k = x_k + iy_k$ for $\sigma$ and at $\bar z_k = x_k - iy_k$ for $\bar\sigma$.
To compute $V_0$ properly, we should have $x_k$ in one column and $y_k$ in the other,
instead of $z_k$ and $\bar z_k$. We can transform from the matrix with columns containing
$z_k$ and $\bar z_k$ to one containing $x_k$ and $y_k$ by first replacing the first column by the
sum of the two, which is $2x_k = z_k + \bar z_k$, and by then replacing the second column
by the difference of the second column and half the new first column, which is
$\tfrac12(\bar z_k - z_k) = -iy_k$. These operations do not change the determinant. Repeating
these steps for each of the $r_2$ pairs of nonreal field mappings, we obtain a matrix
for which the absolute value of the determinant, apart from factors of 2 in $r_2$ of the
columns, is $V_0$. Consequently $V_0 = 2^{-r_2}|\det[\sigma_j(\alpha_i)]|$. Then $V_0 = 2^{-r_2}|D(\Gamma)|^{1/2}$,
and we obtain

\[
V_0 = 2^{-r_2} N(I)\,|D_K|^{1/2}.
\]
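As a small numerical illustration of the factor $2^{-r_2}$, consider an imaginary quadratic case. The choices here are assumptions made for the example and are not taken from the text: the field $\mathbb{Q}(\sqrt{-5})$, the ideal $I = R$ with $\mathbb{Z}$ basis $(1, \sqrt{-5})$, and the standard value $D_K = -20$. The helper names are hypothetical.

```python
import math


def det2(m):
    """Determinant of a 2-by-2 matrix given as nested lists (entries may be complex)."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def v0_from_embeddings(images, r2):
    """V_0 = 2^(-r2) * |det[sigma_j(alpha_i)]|, with images[i][j] = sigma_j(alpha_i)."""
    return 2.0 ** (-r2) * abs(det2(images))


s = math.sqrt(5)

# Rows: alpha_1 = 1 and alpha_2 = sqrt(-5); columns: sigma and its complex conjugate.
v0_embed = v0_from_embeddings([[1, 1], [s * 1j, -s * 1j]], r2=1)

# Direct computation: place each alpha_k in Omega = C = R^2 as (x_k, y_k).
v0_direct = abs(det2([[1.0, 0.0], [0.0, s]]))

# Closed formula 2^(-r2) * N(I) * |D_K|^(1/2) with N(I) = 1 and D_K = -20 (assumed).
v0_formula = 0.5 * math.sqrt(20)

print(v0_embed, v0_direct, v0_formula)
```

All three numbers agree with $\sqrt{5}$, whereas $|\det[\sigma_j(\alpha_i)]|$ alone is $2\sqrt{5}$.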


Now let us compute the volume of the set of $\omega$ in $\Omega$ for which $T(\omega) \le t$. Write
$\omega = (x_1, \dots, x_{r_1}, z_{r_1+1}, \dots, z_{r_1+r_2})$. The volume is the integral of 1 over the set
on which $|x_1| + \cdots + |x_{r_1}| + 2|z_{r_1+1}| + \cdots + 2|z_{r_1+r_2}| \le t$. The set for the integration
is invariant under $x_i \mapsto -x_i$ and under rotation in any variable $z_j$, and hence the
volume equals

\[
2^{r_1}(2\pi)^{r_2} \int_E \rho_{r_1+1} \cdots \rho_{r_1+r_2} \, dx_1 \cdots dx_{r_1} \, d\rho_{r_1+1} \cdots d\rho_{r_1+r_2},
\]

where $E$ is the set on which all variables are $\ge 0$ and

\[
\sum_{i=1}^{r_1} x_i + 2 \sum_{i=r_1+1}^{r_1+r_2} \rho_i \le t.
\]

For $r_1 + 1 \le i \le r_1 + r_2$, introduce $x_i = 2\rho_i$, and make the change of variables.
Then the volume becomes

\[
2^{r_1}\Big(\frac{\pi}{2}\Big)^{r_2} \int_{E'} x_{r_1+1} \cdots x_{r_1+r_2} \, dx_1 \cdots dx_{r_1+r_2},
\]

where $E'$ is the set of $(x_1, \dots, x_{r_1+r_2})$ in $\mathbb{R}^{r_1+r_2}$ with all $x_i \ge 0$ and with
$\sum_{i=1}^{r_1+r_2} x_i \le t$. Finally we make a change of variables that replaces each $x_i$ by $ty_i$,
and the result is that

\[
\mathrm{volume}(\{T(\omega) \le t\}) = 2^{r_1}\Big(\frac{\pi}{2}\Big)^{r_2} t^n \int_S y_{r_1+1} \cdots y_{r_1+r_2} \, dy_1 \cdots dy_{r_1+r_2},
\]

where $S$ is the standard simplex in $\mathbb{R}^{r_1+r_2}$ with all $y_i \ge 0$ and with $\sum_{i=1}^{r_1+r_2} y_i \le 1$.
This definite integral is of a standard type that is evaluated by the following
lemma.


Lemma 5.23. In $\mathbb{R}^m$, let $S$ be the standard simplex with all $x_i \ge 0$ and with
$\sum_{i=1}^{m} x_i \le 1$. If $a_1, \dots, a_m$ are positive real numbers, then

\[
\int_S x_1^{a_1-1} x_2^{a_2-1} \cdots x_m^{a_m-1} \, dx_1 \cdots dx_m
  = \frac{\Gamma(a_1)\Gamma(a_2) \cdots \Gamma(a_m)}{\Gamma(a_1 + \cdots + a_m + 1)}.
\]

REMARKS. The expression $\Gamma(\,\cdot\,)$ is understood to be the usual gamma function,
whose value at positive integers is given by $\Gamma(n + 1) = n!$. We merely sketch
the proof; the details can be found in many books that treat changes of variables
for multiple integrals.²


SKETCH OF PROOF. Let $I$ be the unit cube, given by $0 \le u_i \le 1$ for $1 \le i \le m$.
We make the change of variables $x = g(u)$ that carries the points $u$ of the cube
$I$ one-one onto the points $x$ of the simplex $S$ and that is given by

\[
\begin{aligned}
x_1 &= u_1, \\
x_2 &= (1 - u_1)u_2, \\
    &\;\;\vdots \\
x_m &= (1 - u_1) \cdots (1 - u_{m-1})u_m.
\end{aligned}
\]

The volume element transforms by the absolute value of the Jacobian determinant,
specifically by

\[
dx = |g'(u)|\,du = (1 - u_1)^{m-1}(1 - u_2)^{m-2} \cdots (1 - u_{m-1})\,du,
\]

and the result of the change of variables is that the given integral equals

\[
\prod_{i=1}^{m} \int_0^1 u_i^{a_i - 1}(1 - u_i)^{a_{i+1} + \cdots + a_m} \, du_i.
\]

The factors here can be evaluated by means of Euler's formula

\[
\int_0^1 u^{a-1}(1 - u)^{b-1} \, du = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a + b)},
\]

and the lemma follows.
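For positive integer exponents and $m = 2$, the formula of the lemma can be confirmed exactly with rational arithmetic. The sketch below is an independent check and not part of the proof: it integrates out the second variable by hand, expands $(1 - x)^b$ by the binomial theorem, and compares with the gamma quotient. The function names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial


def simplex_integral_2d(a, b):
    """Exact integral of x^(a-1) * y^(b-1) over {x, y >= 0, x + y <= 1} for integers a, b >= 1.

    Integrating in y first gives (1/b) * integral_0^1 x^(a-1) * (1 - x)^b dx,
    and the remaining integral is evaluated term by term after a binomial expansion.
    """
    total = Fraction(0)
    for k in range(b + 1):
        total += Fraction((-1) ** k * comb(b, k), a + k)
    return total / b


def lemma_rhs(*exponents):
    """Gamma(a_1)...Gamma(a_m) / Gamma(a_1 + ... + a_m + 1) for positive integers a_i."""
    numerator = 1
    for a in exponents:
        numerator *= factorial(a - 1)
    return Fraction(numerator, factorial(sum(exponents)))


print(simplex_integral_2d(1, 2), lemma_rhs(1, 2))
```

For the exponents $(1, 2)$ both sides equal $1/6$, which is the integral of $y$ over the triangle.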


² One such is the author's Basic Real Analysis; the details appear in the problems at the end of
Chapter VI of that book. Another such book is Rudin's Principles of Mathematical Analysis.

For the integral of interest to us, we have $m = r_1 + r_2$, $a_1 = \cdots = a_{r_1} = 1$,
and $a_{r_1+1} = \cdots = a_{r_1+r_2} = 2$. Thus $a_1 + \cdots + a_m = r_1 + 2r_2 = n$, and we obtain

\[
\mathrm{volume}(\{T(\omega) \le t\})
  = 2^{r_1}\Big(\frac{\pi}{2}\Big)^{r_2} t^n \, \frac{\Gamma(1)^{r_1}\Gamma(2)^{r_2}}{\Gamma(n+1)}
  = \frac{2^{r_1 - r_2}\pi^{r_2} t^n}{n!}.
\]
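The closed form can be sanity-checked against two cases in which the region is elementary. This is only an illustrative sketch; the function name `volume_T` is not from the text.

```python
import math


def volume_T(r1, r2, t):
    """Volume of {T(omega) <= t} in R^r1 x C^r2: 2^(r1 - r2) * pi^r2 * t^n / n!, n = r1 + 2*r2."""
    n = r1 + 2 * r2
    return 2.0 ** (r1 - r2) * math.pi ** r2 * t ** n / math.factorial(n)


# r1 = 2, r2 = 0: the square |x1| + |x2| <= 1, whose area is 2.
print(volume_T(2, 0, 1.0))

# r1 = 0, r2 = 1: the disk 2|z| <= 2, i.e. |z| <= 1, whose area is pi.
print(volume_T(0, 1, 2.0))
```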


Finally we can put everything together. We are to solve for $t$ such that this
expression is equal to $2^n V_0$, and then there exists an element $s \ne 0$ in $I$ with
$|N_{K/\mathbb{Q}}(s)| \le t^n/n^n$. Since $V_0 = 2^{-r_2}N(I)|D_K|^{1/2}$, the equation to solve for $t$ is

\[
\frac{2^{r_1 - r_2}\pi^{r_2} t^n}{n!} = 2^n 2^{-r_2} N(I)\,|D_K|^{1/2}.
\]

Thus $t^n = \big(\tfrac{4}{\pi}\big)^{r_2} n!\, N(I)|D_K|^{1/2}$, and the element $s \ne 0$ in $I$ satisfies

\[
|N_{K/\mathbb{Q}}(s)| \le \Big(\frac{4}{\pi}\Big)^{r_2} \frac{n!}{n^n}\, |D_K|^{1/2} N(I).
\]

This completes the proof of Theorem 5.21.
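For concrete fields the right side of the final inequality is easy to evaluate. The sketch below does so for two examples; the discriminant values $-20$ for $\mathbb{Q}(\sqrt{-5})$ and $-243$ for $\mathbb{Q}(\sqrt[3]{3})$ are standard values supplied here as assumptions, not quantities computed in the text, and the function name is illustrative.

```python
import math


def minkowski_bound(n, r2, disc, ideal_norm=1):
    """(4/pi)^r2 * n!/n^n * |D_K|^(1/2) * N(I), the bound appearing in Theorem 5.21."""
    return ((4 / math.pi) ** r2 * math.factorial(n) / n ** n
            * math.sqrt(abs(disc)) * ideal_norm)


# Q(sqrt(-5)): n = 2, r2 = 1, D_K = -20 (assumed value).
print(minkowski_bound(2, 1, -20))

# Q(cube root of 3): n = 3, r2 = 1, D_K = -243 (assumed value).
print(minkowski_bound(3, 1, -243))
```

Since norms of nonzero elements are integers, only the integer part of the bound matters; for the cubic example the integer part is 4.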


7. Problems 


1. Take as known that the discriminant of a cubic polynomial $F(X) = X^3 + pX + q$
is $-(4p^3 + 27q^2)$. In each of the following cases, let $K = \mathbb{Q}[X]/(F(X))$ with
$F(X)$ as indicated, and verify that the field discriminant $D_K$ is as indicated:

(a) $F(X) = X^3 - X - 1$, $D_K = -23$.
(b) $F(X) = X^3 + X + 1$, $D_K = -31$.

2. Let $K = \mathbb{Q}[X]/(F(X))$, where $F(X) = X^3 - 2X^2 + 2$.

(a) Use the formula of the previous problem to show that the discriminant of
the polynomial $F(X)$ is $-44$.

(b) Using Proposition 5.2, show that $D_K$ cannot be $-11$, and conclude that
$D_K = -44$.

3. This problem computes the class number of $K = \mathbb{Q}(\sqrt[3]{3})$.

(a) Show that every equivalence class of nonzero ideals contains an ideal with
norm $\le 4$.

(b) Show that the prime ideals whose norm is a power of 2 are $P_1 = (2, \sqrt[3]{3} - 1)$,
whose norm is 2, and $P_2 = (2, \sqrt[3]{9} + \sqrt[3]{3} + 1)$, whose norm is 4.

(c) Show for $P_1$ that 2 is a multiple of $\sqrt[3]{3} - 1$, and show for $P_2$ that 2 is a
multiple of $\sqrt[3]{9} + \sqrt[3]{3} + 1$.

(d) Show that the only prime ideal whose norm is 3 is $(\sqrt[3]{3})$.

(e) Deduce that the class number of $K$ is 1.

4. Let R be the ring of algebraic integers in the number field K = Q( 7), and let
I be the doubly generated ideal I = (2,1 + 7) in R.

(a) Prove that $N(I) = 2$.

(b) Prove that $I$ is not a principal ideal.

Problems 5–9 give an example of a nontrivial finite extension $L/K$ of number fields
in which no prime ideal for $K$ ramifies in passing to $L$. By contrast, Corollary 5.22
says that there always exists a prime that ramifies in passing from $\mathbb{Q}$ to a nontrivial
finite extension. The example has $L = \mathbb{Q}(\sqrt{-5}, \sqrt{-1})$ and $K = \mathbb{Q}(\sqrt{-5})$. Let
$K' = \mathbb{Q}(\sqrt{5})$ and $K'' = \mathbb{Q}(\sqrt{-1})$. Observe that $L/\mathbb{Q}$ is a Galois extension, and so
are all the various quadratic extensions of $L$ over $K$, $K'$, and $K''$, as well as of $K$, $K'$,
and $K''$ over $\mathbb{Q}$. The problems make use of the fact that ramification indices multiply
in passing to an extension in stages, and so do residue class degrees.


5. Show that the minimal polynomial of $\sqrt{-1} + \sqrt{-5}$ over $\mathbb{Q}$ is $X^4 + 12X^2 + 16$,
and deduce that the elements $\tfrac12(\sqrt{-1} \pm \sqrt{-5})$ are algebraic integers in $L$.

6. By making use of the formula for $D(\xi)$ in terms of $\mathcal{D}(\xi)$, where $\xi$ is an element in
$L$, prove that $|D(\tfrac12(\sqrt{-1} + \sqrt{-5}))| = 2^4 5^2$. Consequently $D_L$ divides $2^4 5^2$.

7. Verify the following decompositions of the ideals (2) and (5) when extended
from $\mathbb{Z}$ to the rings $R$, $R'$, and $R''$ of algebraic integers in $K$, $K'$, and $K''$:

(a) $(2)R = \wp^2$ with $f = 1$, and $(5)R = \wp^2$ with $f = 1$.

(b) $(2)R' = \wp$ with $f = 2$, and $(5)R' = \wp^2$ with $f = 1$.

(c) $(2)R'' = \wp^2$ with $f = 1$, and $(5)R'' = \wp_1\wp_2$ with $f = 1$.

8. Let $T$ be the ring of algebraic integers in $L$. Since $L/\mathbb{Q}$ is a Galois extension, the
only possible decompositions of $(p)T$, when $p$ is a prime number, have $(e, f, g)$
equal to (4, 1, 1) or (2, 2, 1) or (2, 1, 2) or (1, 4, 1) or (1, 2, 2) or (1, 1, 4). Here
$e$ is the ramification index, $f$ is the residue class degree, and $g$ is the number of
distinct prime factors. Using the product formulas for ramification degrees and
comparing what happens for the passage $\mathbb{Q} \subset K' \subset L$ with what happens for
the passage $\mathbb{Q} \subset K'' \subset L$, show that the only possibilities for $(p)T$ with $p = 2$
and $p = 5$ are

(a) $(e, f, g) = (2, 2, 1)$ for $(2)T$, i.e., $(2)T = P^2$ with $\dim_{\mathbb{F}_2}(T/P) = 2$.

(b) $(e, f, g) = (2, 1, 2)$ for $(5)T$, i.e., $(5)T = P_1^2P_2^2$ with
$\dim_{\mathbb{F}_5}(T/P_1) = \dim_{\mathbb{F}_5}(T/P_2) = 1$.

9. Return to the situation with $\mathbb{Q} \subset K \subset L$, where $K = \mathbb{Q}(\sqrt{-5})$. According to
Problem 7a, the prime decompositions of $(2)R$ and $(5)R$ are $(2)R = \wp_2^2$ and
$(5)R = \wp_5^2$.

(a) Using the results of Problem 8, show that $\wp_2T = P$ and $\wp_5T = P_1P_2$, i.e.,
$\wp_2T$ is prime, and $\wp_5T$ is the product of two distinct prime ideals.

(b) Show how to conclude from these facts and from Theorem 5.6 that no prime
ideal in $R$ ramifies in $T$. (Educational note: The field $L$ is the "Hilbert class
field" of $K$ in the sense of Section 1; the order of the Galois group $\mathrm{Gal}(L/K)$
matches the class number of $K$.)


Problems 10–16 concern the cyclotomic field $K = \mathbb{Q}(e^{2\pi i/p})$, where $p > 2$ is a
prime number. They show that the discriminant is given by
$D_K = (-1)^{(p-1)(p-2)/2} p^{p-2}$ and that a $\mathbb{Z}$ basis of the ring $R$ of algebraic integers
in $K$ consists of $\{1, \zeta, \zeta^2, \dots, \zeta^{p-2}\}$, where $\zeta = e^{2\pi i/p}$.

10. Show that $K$ has no real-valued field mappings into $\mathbb{C}$, and deduce that $N_{K/\mathbb{Q}}(x)$
is positive for every $x \ne 0$ in $K$.

11. Let $F(X) = X^{p-1} + X^{p-2} + \cdots + 1$ be the minimal polynomial of $\zeta$ over $\mathbb{Q}$,
and let $G(X) = F(X + 1)$. Suppose that $k$ is an integer with $\mathrm{GCD}(k, p) = 1$.

(a) Prove that $G(X)$ is the minimal polynomial of $\zeta^k - 1$, and deduce that the
norm of $\zeta^k - 1$ is given by $F(1) = p$.

(b) Why does it follow that $N_{K/\mathbb{Q}}(1 - \zeta^k) = p$?

(c) Prove that $(1 - \zeta^k)/(1 - \zeta)$ is a unit of $R$.

12. With notation as in the previous problem, prove that the different $\mathcal{D}(\zeta^k)$ of $\zeta^k$
has $|\mathcal{D}(\zeta^k)| = p/|\zeta^k - 1|$.

13. Deduce from the previous problem that $D(\zeta) = (-1)^{(p-1)(p-2)/2} p^{p-2}$.

14. Let $\lambda = 1 - \zeta$. Problem 11b shows that $N_{K/\mathbb{Q}}(\lambda) = p$. Prove that

(a) the $\mathbb{Z}$ span of $\{1, \zeta, \zeta^2, \dots, \zeta^{p-2}\}$ equals the $\mathbb{Z}$ span of $\{1, \lambda, \lambda^2, \dots, \lambda^{p-2}\}$.

(b) an equality $p = \prod_{k=1}^{p-1} (1 - \zeta^k)$ holds.

(c) there exists a unit $\varepsilon$ of $R$ such that $p = \varepsilon(1 - \zeta)^{p-1} = \varepsilon\lambda^{p-1}$.

15. Using Problem 14c, prove that the principal ideals $(p)R$ and $(\lambda)$ in $R$ are related
by $(p)R = (\lambda)^{p-1}$, and deduce from this fact that $(\lambda)$ is a prime ideal.

16. Apply Proposition 5.2 to the $\mathbb{Q}$ basis $\{1, \lambda, \lambda^2, \dots, \lambda^{p-2}\}$ of $K$ lying in $R$ to show
that no factor of $p^2$ can be eliminated from $D(\lambda) = D(\zeta)$; take into account the
highest powers of $\lambda$ that divide each term. Conclude that $D_K = D(\zeta)$ and that
$\{1, \zeta, \zeta^2, \dots, \zeta^{p-2}\}$ is a $\mathbb{Z}$ basis of $R$.


Problems 17–18 use the same notation as in the text of the chapter: $K$ is a number
field of degree $n$ over $\mathbb{Q}$, $R$ is its ring of algebraic integers, $D_K$ is its field discriminant,
the field mappings of $K$ into $\mathbb{C}$ are denoted by $\sigma_i$ for $1 \le i \le n$, $r_1$ of the $\sigma_i$'s are
real-valued, and $r_2$ complex-conjugate pairs of the $\sigma_i$'s are nonreal.

17. Prove that the sign of $D_K$ is $(-1)^{r_2}$.

18. (Stickelberger's condition) Let $\Gamma = (\alpha_1, \dots, \alpha_n)$ be an ordered $n$-tuple of
members of $R$ linearly independent over $\mathbb{Q}$, and suppose that $K/\mathbb{Q}$ is a Galois
extension. Write $\det[\sigma_j(\alpha_i)] = P - N$, where $P$ is the sum of all the terms of
the determinant corresponding to even permutations and $N$ is the sum corresponding
to odd permutations. Using Galois theory, prove that $P + N$ and $PN$
are in $\mathbb{Z}$. Then write $D(\Gamma) = (\det[\sigma_j(\alpha_i)])^2 = (P + N)^2 - 4PN$, and deduce that
the integer $D(\Gamma)$ is congruent to 1 or 0 modulo 4. (Educational note: A variant
of this argument proves the same conclusion about $D(\Gamma)$ without the assumption
that $K/\mathbb{Q}$ is a Galois extension. One makes use of the smallest normal extension
of $\mathbb{Q}$ containing $K$; this is the splitting field of the minimal polynomial of any
primitive element of $K$.)


Problems 19–23 continue with the notation of Problems 17–18. It is to be proved that
a suitable localization $S^{-1}R$ of $R$ is a principal ideal domain for which the group of
units is finitely generated as an abelian group. Let $h$ be the class number of $K$.

19. Let $I_1, \dots, I_h$ be ideals representing all the equivalence classes of ideals in $R$.
For each $I_j$, let $u_j$ be a nonzero element of $I_j$, and put $u = u_1 \cdots u_h$. Define
$S = \{1, u, u^2, \dots\}$. Prove that $S^{-1}R$ is a principal ideal domain.

20. (a) Prove that if a member $a$ of $R$ divides $u^k$ within $R$ for some $k \ge 0$, then $a$
is a unit in $S^{-1}R$, i.e., $a^{-1}$ is in $S^{-1}R$.

(b) Prove conversely that if a member $a$ of $R$ has the property that $au^{-m}$ is a
unit in $S^{-1}R$ for some $m \ge 0$, then $a$ divides $u^k$ within $R$ for some integer
$k \ge 0$.

21. Let $P_1, \dots, P_l$ be the distinct prime ideals appearing in the unique factorization
of $(u)$, and suppose that $P_j^h = (b_j)$ for $1 \le j \le l$. Let $au^{-m}$ and $k$ be as in
Problem 20b, and write $u^k = ab$ with $b \in R$.

(a) Why must each $b_j$ necessarily be a unit in $S^{-1}R$?

(b) Prove that there exist integers $n_j \ge 0$ for $1 \le j \le l$ such that the element
$d = \prod_{j=1}^{l} b_j^{n_j}$ has $(a) = (d)P_1^{t_1} \cdots P_l^{t_l}$ for some integers $t_j$ with $0 \le t_j \le h - 1$.

(c) In this case, why must $P_1^{t_1} \cdots P_l^{t_l}$ be a principal ideal?

22. Suppose that there are $N$ tuples $(e_1, \dots, e_l)$ with $0 \le e_j \le h - 1$ for all $j$ such
that $P_1^{e_1} \cdots P_l^{e_l}$ is a principal ideal. For the $i$th such tuple, let the principal ideal
be denoted by $(c_i)$, $1 \le i \le N$. Prove that if $k$, $a$, and $d$ are as in the previous
problem and if the principal ideal in (c) of that problem is $(c_i)$, then $a = dc_i\varepsilon$
for some $\varepsilon$ in $R^\times$.

23. Conclude from the three previous problems that the group of units of $S^{-1}R$ is
finitely generated as an abelian group.


Problems 24–32 complete the discussion in Section 4 of Dedekind's example of a
cubic extension of $\mathbb{Q}$ with a common index divisor. The field is $K = \mathbb{Q}(\xi)$, where
$\xi$ is a root of $F(X) = X^3 + X^2 - 2X + 8$, and it was shown in Section 4 that
$D(\xi) = -2^2 \cdot 503$. Let $R$ be the ring of algebraic integers in $K$. It will be shown that
$R$ is a principal ideal domain.


24. Show that $\eta = 4/\xi$ is a root of the polynomial $G(X) = X^3 - X^2 + 2X + 8$, and
conclude that $\eta$ is in $R$.

25. (a) By rewriting $F(\xi)/\xi$ in terms of $\xi$ and $\eta$, show that $\xi^2 + \xi - 2 + 2\eta = 0$.

(b) By rewriting $G(\eta)/\eta$ in terms of $\xi$ and $\eta$, show that $2 + 2\xi - \eta + \eta^2 = 0$.
Conclude from this formula and (a) that products of $\xi$ and $\eta$ may be simplified
according to the table

\[
\xi^2 = -\xi + 2 - 2\eta, \qquad \eta^2 = -2 - 2\xi + \eta, \qquad \xi\eta = 4.
\]

(c) Using the first formula in (b), deduce the containment of abelian groups
given by $\mathbb{Z}(\{1, \xi, \xi^2\}) \subseteq \mathbb{Z}(\{1, \xi, \eta\})$.

(d) Using the first formula in (b), deduce that $\eta$ does not lie in $\mathbb{Z}(\{1, \xi, \xi^2\})$.

(e) Conclude from the above facts that $\{1, \xi, \eta\}$ and $\{1, \xi, \tfrac12(\xi^2 + \xi)\}$ are $\mathbb{Z}$
bases of $R$.

26. Let $P$ be a prime ideal in $R$ containing $(2)R$, write $F$ for the field $R/P$, let
$\varphi : R \to F$ be the quotient homomorphism, and let $\bar\xi = \varphi(\xi)$ and $\bar\eta = \varphi(\eta)$. By
applying $\varphi$ to the table in Problem 25b and using the fact that the additive group
generated by $\{1, \xi, \eta\}$ is all of $R$, prove that $F$ has only two elements, i.e., that
the residue class degree is $f = 1$, and that the only possibilities for $\varphi$ are the
following:

\[
\begin{aligned}
\varphi &= \varphi_{0,0} \quad\text{with}\quad \varphi_{0,0}(\xi) = 0,\ \ \varphi_{0,0}(\eta) = 0, \\
\varphi &= \varphi_{1,0} \quad\text{with}\quad \varphi_{1,0}(\xi) = 1,\ \ \varphi_{1,0}(\eta) = 0, \\
\varphi &= \varphi_{0,1} \quad\text{with}\quad \varphi_{0,1}(\xi) = 0,\ \ \varphi_{0,1}(\eta) = 1.
\end{aligned}
\]

27. Conversely show that the three functions $\varphi_{0,0}$, $\varphi_{1,0}$, $\varphi_{0,1}$ defined on $\xi$ and $\eta$ in
the previous problem extend to well-defined ring homomorphisms of $R$ onto $\mathbb{F}_2$.

28. Let $P_{0,0}$, $P_{1,0}$, and $P_{0,1}$ be the kernels of the ring homomorphisms in the previous
problem. Prove that these ideals all have norm 2 and that $(2)R = P_{0,0}P_{1,0}P_{0,1}$.

29. (a) Prove that $P_{0,0} = (2, \xi, \eta)$, $P_{1,0} = (2, \xi + 1, \eta)$, and $P_{0,1} = (2, \xi, \eta + 1)$.

(b) Exhibit $\eta$ as a member of the ideal $(2, \xi + 1)$, and show therefore that
$P_{1,0} = (2, \xi + 1)$.

(c) Similarly show that $P_{0,1} = (2, \eta + 1)$ and that $P_{0,0} = (2, \xi - \eta)$.

30. The previous problem exhibited $P_{0,0}$, $P_{1,0}$, and $P_{0,1}$ explicitly as doubly generated.
In fact, use of the norm map $N_{K/\mathbb{Q}}$ will ultimately show them to be principal
ideals.

(a) Show that if $H(X)$ is the field polynomial over $\mathbb{Q}$ of an element $\theta$ in $K$, then
$N_{K/\mathbb{Q}}(\theta) = -H(0)$ and $N_{K/\mathbb{Q}}(\theta - q) = -H(q)$ for every $q \in \mathbb{Q}$.

(b) Prove that $N_{K/\mathbb{Q}}(\xi) = N_{K/\mathbb{Q}}(\eta) = -8 = -2^3$, that $|N_{K/\mathbb{Q}}(\xi + 3)| = 2^2$,
that $|N_{K/\mathbb{Q}}(\xi - 1)| = |N_{K/\mathbb{Q}}(\xi + 2)| = 2^3$, and that $|N_{K/\mathbb{Q}}(\xi - 2)| = 2^4$.

(c) Prove that $(\xi) = P_{0,0}^{a}P_{1,0}^{b}P_{0,1}^{c}$ for unique exponents $\ge 0$ whose sum is 3,
and that $(\eta) = P_{0,0}^{\alpha}P_{1,0}^{\beta}P_{0,1}^{\gamma}$ for unique exponents $\ge 0$ whose sum is 3.

(d) Using the fact that $\xi\eta = 4$, prove that $a + \alpha = b + \beta = c + \gamma = 2$.

(e) Using the definitions of $P_{0,0}$, $P_{1,0}$, and $P_{0,1}$ as kernels, prove that $b = 0$ and
$\gamma = 0$.

(f) Conclude that $(\xi) = P_{0,0}P_{0,1}^2$ and that $(\eta) = P_{0,0}P_{1,0}^2$.

31. This problem uses the norm computations in Problem 30b.

(a) Using the defining homomorphisms, show that if $l$ is an odd integer, then
$P_{1,0}$ contains $(\xi + l)$, but $P_{0,0}$ and $P_{0,1}$ do not.

(b) Show that $(\xi + 3) = P_{1,0}^2$ and that $(\xi - 1) = P_{1,0}^3$.

(c) Using the defining homomorphisms, show that if $l$ is an even integer, then
$P_{0,1}$ contains $(\xi + l)$, but $P_{1,0}$ does not.

(d) Show that $(2, \xi) = P_{0,0}P_{0,1}$.

(e) Show that if $l$ is an even integer not divisible by 4, then $P_{0,1}^2$ does not contain
$(\xi + l)$.

(f) Show that $(\xi + 2) = P_{0,0}^2P_{0,1}$ and that $(\xi - 2) = P_{0,0}^3P_{0,1}$.

32. (a) From the identity $(\xi + 2)P_{0,0} = (\xi - 2)$ that results from Problem 31f,
deduce that $\pi_{0,0} = (\xi - 2)/(\xi + 2)$ is in $R$ and that $P_{0,0} = (\pi_{0,0})$.

(b) Deduce similarly that $P_{1,0}$ and $P_{0,1}$ are principal ideals.

(c) Using Theorem 5.6, show that $R$ contains no ideals of norm 3.

(d) Using Theorem 5.6, show that the only ideal in $R$ of norm 5 is $(5, 1 + \xi)$.

(e) Show that $|N_{K/\mathbb{Q}}(1 + \xi)| = 10$, and deduce that $(1 + \xi) = (5, 1 + \xi)P$,
where $P$ is one of the three ideals $P_{0,0}$, $P_{1,0}$, and $P_{0,1}$.

(f) Why does it follow that $(5, 1 + \xi)$ is a principal ideal?

(g) Prove that $R$ is a principal ideal domain.


CHAPTER VI 


Reinterpretation with Adeles and Ideles 


Abstract. This chapter develops tools for a more penetrating study of algebraic number theory than 
was possible in Chapter V and concludes by formulating two of the main three theorems of Chapter 
V in the modern setting of “adeles” and “ideles” commonly used in the subject. 

Sections 1–5 introduce discrete valuations, absolute values, and completions for fields, always
paying attention to implications for number fields and for certain kinds of function fields. Section 1
contains a prototype for all these notions in the construction of the field $\mathbb{Q}_p$ of $p$-adic numbers formed
out of the rationals. Discrete valuations in Section 2 are a generalization of the order-of-vanishing
function about a point in the theory of one complex variable. Absolute values in Section 3 are
real-valued multiplicative functions that give a metric on a field, and the pair consisting of a field and
an absolute value is called a valued field. Inequivalent absolute values have a certain independence
property that is captured by the Weak Approximation Theorem. Completions in Section 4 are
functions mapping valued fields into their metric-space completions. Section 5 concerns Hensel's
Lemma, which in its simplest form allows one to lift roots of polynomials over finite prime fields
$\mathbb{F}_p$ to roots of corresponding polynomials over $p$-adic fields $\mathbb{Q}_p$.

Section 6 contains the main theorem for investigating the fundamental question of how prime
ideals split in extensions. Let $K$ be a finite separable extension of a field $F$, let $R$ be a Dedekind
domain with field of fractions $F$, and let $T$ be the integral closure of $R$ in $K$. The question concerns
the factorization of an ideal $\mathfrak{p}T$ in $T$ when $\mathfrak{p}$ is a nonzero prime ideal in $R$. If $F_{\mathfrak{p}}$ denotes the
completion of $F$ with respect to $\mathfrak{p}$, the theorem explains how the tensor product $K \otimes_F F_{\mathfrak{p}}$ splits
uniquely as a direct sum of completions of valued fields. The theorem in effect reduces the question
of the splitting of $\mathfrak{p}T$ in $T$ to the splitting of $F_{\mathfrak{p}}$ in a complete field in which only one of the prime
factors of $\mathfrak{p}T$ plays a role.

Section 7 is a brief aside mentioning additional conclusions one can draw when the extension 
K/F is a Galois extension. 

Section 8 applies the main theorem of Section 6 to an analysis of the different of K/F and 
ultimately to the absolute discriminant of a number field. With the new sharp tools developed in the 
present chapter, including a Strong Approximation Theorem that is proved in Section 8, a complete 
proof is given for the Dedekind Discriminant Theorem; only a partial proof had been accessible in 
Chapter V. 

Sections 9–10 specialize to the case of number fields and to function fields that are finite separable
extensions of $\mathbb{F}_q(X)$, where $\mathbb{F}_q$ is a finite field. The adele ring and the idele group are introduced
for each of these kinds of fields, and it is shown how the original field embeds discretely in the
adeles and how the multiplicative group embeds discretely in the ideles. The main theorems are
compactness theorems about the quotient of the adeles by the embedded field and about the quotient
of the normalized ideles by the embedded multiplicative group. Proofs are given only for number
fields. In the first case the compactness encodes the Strong Approximation Theorem of Section 8
and the Artin product formula of Section 9. In the second case the compactness encodes both the
finiteness of the class number and the Dirichlet Unit Theorem.

1. p-adic Numbers


This chapter will sharpen some of the number-theoretic techniques used in 
Chapter V, finally arriving at the setting of “adeles” and “ideles” in which many 
of the more recent results in number theory have tidy formulations. Although 
Chapter V dealt only with number fields, the present chapter will allow a greater 
degree of generality that includes results in the algebraic geometry of curves. 
This greater degree of generality will not require much extra effort, and it will 
allow us to use each of the subjects of number theory and algebraic geometry to 
motivate the other. 

The first section of Chapter V returned to the idea that one can get some
information about the integer solutions of a Diophantine equation by considering
the equation as a system of congruences modulo each prime number. However,
we lose information by considering only primes for the modulus, and this fact
lies behind the failure of Chapter V to give a complete proof of the Dedekind
Discriminant Theorem (Theorem 5.5). The proof that we did give was of a related
result, Kummer's criterion (Theorem 5.6), which concerns a field $\mathbb{Q}(\xi)$, where
$\xi$ is a root of an irreducible monic polynomial $F(X)$ in $\mathbb{Z}[X]$. The statement of
Theorem 5.6 involves the reduction of $F(X)$ modulo certain prime numbers $p$
and no other congruences.

The Chinese Remainder Theorem tells us that a congruence modulo any integer 
can be solved by means of congruences modulo prime powers, and the formulation 
of Theorem 5.6 uses only congruences modulo primes raised to the first power. 
Let us strip away the complicated setting from such congruences and see some 
examples of how the use of prime powers can make a difference. 


EXAMPLES. 


(1) Consider the problem of finding a square root of 5 modulo powers of 2.
For the first power, we have

\[
x^2 - 5 = (x - 1)(x + 1) - 4 \equiv (x - 1)^2 \bmod 2,
\]

i.e., $x^2 - 5$ is the square of a linear factor modulo 2. For the second power, the
computation is

\[
x^2 - 5 = (x - 1)(x + 1) - 4 \equiv (x - 1)(x + 1) \bmod 4,
\]

and $x^2 - 5$ is the product of two distinct linear factors modulo 4. For the third
power, $x^2 - 5$ is irreducible modulo 8 because every odd square is congruent to 1
modulo 8. Thus the polynomial $x^2 - 5$ exhibits a third kind of behavior when considered
modulo 8. For higher powers of 2, the irreducibility persists because a nontrivial
factorization modulo $2^k$ with $k > 3$ would imply a nontrivial factorization
modulo 8.

(2) Consider the problem of finding a square root of 17 modulo powers of 2.
We readily compute that

\[
\begin{aligned}
x^2 - 17 &= (x - 1)(x + 1) - 16 \equiv (x - 1)^2 \bmod 2, \\
x^2 - 17 &= (x - 1)(x + 1) - 16 \equiv (x - 1)(x + 1) \bmod 4, \\
x^2 - 17 &= (x - 1)(x + 1) - 16 \equiv (x - 1)(x + 1) \bmod 8, \\
x^2 - 17 &= (x - 1)(x + 1) - 16 \equiv (x - 1)(x + 1) \bmod 16, \\
x^2 - 17 &= (x - 7)(x + 7) + 32 \equiv (x - 7)(x + 7) \bmod 32, \\
x^2 - 17 &= (x - 9)(x + 9) + 64 \equiv (x - 9)(x + 9) \bmod 64,
\end{aligned}
\]

i.e., that the factorization of $x^2 - 17$ begins in the same way as for $x^2 - 5$ but that
$x^2 - 17$ continues to factor as the product of two distinct linear factors modulo
$2^3$, $2^4$, $2^5$, and $2^6$. We can argue inductively that this pattern persists through all
higher powers. In fact, suppose that $x^2 - 17 \equiv (x - m)(x + m) \bmod 2^k$ for an
integer $k \ge 3$. Then

\[
x^2 - 17 = x^2 - m^2 + a2^k,
\]

and $m$ must be odd. Then we can write

\[
x^2 - 17 = \big(x - (m - a2^{k-1})\big)\big(x + (m - a2^{k-1})\big) + a2^k(1 - m + a2^{k-2}).
\]

The factor $(1 - m + a2^{k-2})$ is even, and this equality shows that $x^2 - 17$ is the
product of two distinct linear factors modulo $2^{k+1}$. This completes the induction.
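The contrast between the two examples can be observed by brute force. The sketch below simply lists all residues whose square is congruent to a given integer; the function name `square_roots` is illustrative and not from the text.

```python
def square_roots(c, modulus):
    """All residues x in [0, modulus) with x*x congruent to c modulo `modulus`."""
    return [x for x in range(modulus) if (x * x - c) % modulus == 0]


for k in range(1, 7):
    print(k, square_roots(5, 2 ** k), square_roots(17, 2 ** k))
```

For 5 the list is nonempty modulo 2 and modulo 4 and empty from modulus 8 onward, while for 17 it is nonempty for every power of 2 tested; it contains 7 modulo 32 and 9 modulo 64, matching the factorizations displayed in Example 2.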


One immediate observation from the two examples is that the factorizations
of $x^2 - 5$ and $x^2 - 17$ are the same modulo 2 and modulo $2^2$ but are qualitatively
distinct modulo higher powers of 2. Another observation is the nature of the data
produced by the inductive argument in Example 2: For each $k$, we obtain an odd
integer $m_k$ such that $m_k^2 \equiv 17 \bmod 2^k$, and the $m_k$'s are constructed in such a
way that $m_{k+1} = m_k - a_k 2^{k-1}$ if $m_k^2 = 17 + a_k 2^k$. It follows that if $l > k$, then
$m_l - m_k$ is divisible by $2^{k-1}$, i.e., by higher and higher powers of 2 as $k$ increases.
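The recursion for the $m_k$ can be run directly. The sketch below starts from $m_3 = 1$, which is a choice made here because $1^2 = 17 - 16$; the recursion then produces representatives that may differ from the 7 and 9 used in Example 2, since a square root modulo $2^k$ is not unique. The function name is illustrative.

```python
def lift_sqrt17(k_max):
    """Return {k: m_k} with m_k odd and m_k^2 congruent to 17 modulo 2^k for 3 <= k <= k_max.

    Each step applies m_{k+1} = m_k - a_k * 2^(k-1), where m_k^2 = 17 + a_k * 2^k.
    """
    m = 1  # starting value: 1^2 = 17 - 2 * 2^3
    roots = {3: m}
    for k in range(3, k_max):
        a = (m * m - 17) // 2 ** k  # exact division, since 2^k divides m^2 - 17
        m = m - a * 2 ** (k - 1)
        roots[k + 1] = m
    return roots


roots = lift_sqrt17(12)
for k, m in roots.items():
    print(k, m, (m * m - 17) % 2 ** k)
```

The last column printed is always 0, and consecutive values $m_{k+1}$ and $m_k$ differ by a multiple of $2^{k-1}$.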

A first conclusion is that we get additional information by using congruences
modulo prime powers. A second and more subtle conclusion is that it would be
desirable to regard the sequence $\{m_k\}$ as stabilizing in some sense; then we could
regard the system of congruences modulo all powers $2^k$ as having a single pair
of solutions that we can consider as square roots of 17. In this case we would
not have to think about infinitely many solutions to infinitely many unrelated
congruences.

The construction that is to follow in this section, which is due to K. Hensel,
will capture this information as a single "2-adic number." Conversely the 2-adic
number carries with it the congruence information modulo $2^k$ for all positive
integers $k$.

Thus the revised method of considering congruences prime by prime will be
a two-step process, first a step of "localization" and then a step of "completion."
In our application in Chapter V, we did not explicitly make use of localization
in the sense of Chapter VIII of Basic Algebra, but it was there implicitly, in
Proposition 5.2 for example and in the proof of Theorem 5.6. Carrying out the
details of setting up the theory behind the two-stage process will take some work
and will occupy the first four sections of this chapter. Let us get started.

Let $p$ be a prime number. We define a real-valued function $|\cdot|_p$ on the field
$\mathbb{Q}$ of rationals as follows: we take $|0|_p = 0$, and for any rational $r = p^m ab^{-1}$
with $a$ and $b$ equal to integers relatively prime to $p$, we define $|r|_p = p^{-m}$. The
function $|\cdot|_p$ is called the $p$-adic absolute value on $\mathbb{Q}$. It has the following
properties:

(i) $|x|_p \ge 0$ with equality if and only if $x = 0$,

(ii) $|x + y|_p \le \max(|x|_p, |y|_p)$,

(iii) $|xy|_p = |x|_p|y|_p$,

(iv) $|-1|_p = |1|_p = 1$, and

(v) $|-x|_p = |x|_p$.

In fact, with (ii), equality holds if $|x|_p \ne |y|_p$, and the case with $|x|_p = |y|_p$
comes down to the observation that $\frac{a}{b} + \frac{c}{d} = \frac{ad + bc}{bd}$ has no factor of $p$ in its
denominator if $b$ and $d$ are relatively prime to $p$. Property (iii) comes down to the
fact that if $a, b, c, d$ are relatively prime to $p$, then so are $ac$ and $bd$. The other
properties follow from the first three: To see that $|1|_p = 1$ in (iv), we observe
from (iii) that $|1|_p$ is a nonzero solution of $x^2 = x$ and thus has to be 1. This
conclusion and (iii) together show that $|-1|_p$ is a positive solution of $x^2 = 1$ and
thus has to be 1. Property (v) follows immediately by combining (iii) and (iv).
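The definition translates directly into exact rational arithmetic. The following is a minimal sketch with an illustrative function name `padic_abs`; it strips factors of $p$ from the numerator and denominator of a fraction in lowest terms and returns $p^{-m}$.

```python
from fractions import Fraction


def padic_abs(r, p):
    """p-adic absolute value of a rational r, returned as an exact Fraction."""
    r = Fraction(r)
    if r == 0:
        return Fraction(0)
    m = 0
    num, den = r.numerator, r.denominator
    while num % p == 0:  # each factor of p in the numerator raises the exponent m
        num //= p
        m += 1
    while den % p == 0:  # each factor of p in the denominator lowers it
        den //= p
        m -= 1
    return Fraction(p) ** (-m)


x, y = Fraction(1, 2), Fraction(3, 4)
print(padic_abs(12, 2), padic_abs(Fraction(3, 8), 2))
print(padic_abs(x + y, 2), max(padic_abs(x, 2), padic_abs(y, 2)))
print(padic_abs(x * y, 2), padic_abs(x, 2) * padic_abs(y, 2))
```

In this example $|x|_2 = 2$ and $|y|_2 = 4$ are different, so equality holds in (ii), and the last line illustrates (iii).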

Inequality (ii) is called the ultrametric inequality. It implies that $|x + y|_p \le
|x|_p + |y|_p$, and consequently the function $d(x, y) = |x - y|_p$ satisfies the triangle
inequality

\[
d(x, y) \le d(x, z) + d(z, y).
\]

Since (i) shows that $d(x, y) \ge 0$ with equality exactly when $x = y$ and since (v)
implies that $d(x, y) = |x - y|_p = d(y, x)$, the function $d$ on $\mathbb{Q} \times \mathbb{Q}$ is a metric.
It is called the $p$-adic metric on $\mathbb{Q}$.

The field $\mathbb{Q}_p$ of $p$-adic numbers will be obtained by completing this metric and
extending the field operations to the completion. Let us see to the details. Regard
the space $\prod_{j=1}^{\infty} \mathbb{Q}$ of sequences $\{q_j\}_{j=1}^{\infty}$ of rational numbers as the direct product
of copies of the ring $\mathbb{Q}$, the operations being taken coordinate by coordinate.
Then $\prod_{j=1}^{\infty} \mathbb{Q}$ is a commutative ring with identity, the identity being the sequence
whose terms are all equal to 1.

As is usual for metric spaces, we say that a sequence of rationals, i.e., a member
$\{q_j\}$ of $\prod_{j=1}^{\infty} \mathbb{Q}$, is convergent to $q \in \mathbb{Q}$ in the $p$-adic metric if for any real $\epsilon > 0$,
there exists an integer $N$ such that $|q_n - q|_p < \epsilon$ for all $n \ge N$. Convergence
in this metric is quite different from what one might expect; for example the
sequence $\{2^j\}_{j=1}^{\infty}$ is convergent to 0 when $p = 2$. The sequence $\{q_j\}$ is a Cauchy
sequence in the $p$-adic metric if for any real $\epsilon > 0$, there exists an integer $N$
such that $|q_m - q_n|_p < \epsilon$ for all $m \ge N$ and all $n \ge N$. Convergent sequences
are Cauchy, as follows from the inequality $|q_m - q_n|_p \le |q_m - q|_p + |q - q_n|_p$.
Cauchy sequences need not be convergent, but every Cauchy sequence $\{q_n\}$ is
bounded in the sense that there is some real $C$ with $|q_n|_p \le C$ for all $n$.
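The convergence of the powers of 2 to 0 in the 2-adic metric can be made concrete by finding, for a given $\epsilon$, an index $N$ past which every term is within $\epsilon$ of 0. This is a small sketch with illustrative names `abs2` and `tail_index`; it relies on the fact that $|2^j|_2 = 2^{-j}$ decreases with $j$.

```python
from fractions import Fraction


def abs2(n):
    """2-adic absolute value of an integer n, as an exact Fraction."""
    if n == 0:
        return Fraction(0)
    v = 0
    while n % 2 == 0:
        n //= 2
        v += 1
    return Fraction(1, 2 ** v)


def tail_index(eps):
    """Smallest j >= 1 with |2^j|_2 < eps; all later terms are smaller still."""
    j = 1
    while abs2(2 ** j) >= eps:
        j += 1
    return j


print(tail_index(Fraction(1, 100)))
```

With $\epsilon = 1/100$ the index found is 7, because $2^{-6} = 1/64$ is still at least $1/100$ while $2^{-7} = 1/128$ is not.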


EXAMPLE 2, CONTINUED. We obtained a sequence $\{m_k\}$ of odd integers such
that $l > k$ implies that $m_l - m_k$ is divisible by $2^{k-1}$ and $m_k^2 - 17$ is divisible by $2^k$.
In terms of the 2-adic absolute value, $|m_l - m_k|_2 \le 2^{-(k-1)}$ and $|m_k^2 - 17|_2 \le 2^{-k}$.
The sequence $\{m_k\}$ is therefore a Cauchy sequence in the 2-adic metric, and the
sequence $\{m_k^2\}$ is convergent in the 2-adic metric to 17.


It follows from the ultrametric inequality that the sum and difference of Cauchy
sequences are Cauchy, and (ii), (iii), and the boundedness of Cauchy sequences together
imply that the product of two Cauchy sequences is Cauchy. Therefore the subset $\mathcal{R}$ of
Cauchy sequences is a subring with identity within $\prod_{j=1}^{\infty} \mathbb{Q}$.

In the theory of metric spaces, one defines a suitable notion of equivalence of 
Cauchy sequences, and the set of equivalence classes becomes a complete metric 
space,¹ any member $q$ of $\mathbb{Q}$ being identified with the constant Cauchy sequence
whose terms all equal $q$. With the p-adic metric, one can then prove that the field
operations extend to the completion, and the completion is the field of p-adic 
numbers. This verification is a little tedious when done directly, and we can 
proceed more expeditiously by using some elementary ring theory. 

Since convergent sequences are Cauchy, the set $\mathcal{I}$ of sequences convergent to 0
is a subset of the ring $\mathcal{R}$. The sum or difference of two such sequences is again
convergent to 0, and $\mathcal{I}$ is an additive subgroup. We shall show that $\mathcal{I}$ is in fact
an ideal in $\mathcal{R}$. Thus let $\{z_n\}$ be convergent to 0, and let $\{q_n\}$ be Cauchy. Since
$\{q_n\}$ is Cauchy, it is bounded, say with $|q_n|_p \le M$ for all $n$. If $\epsilon > 0$ is given,
choose $N$ such that $n \ge N$ implies $|z_n|_p < \epsilon/M$. Then $n \ge N$ implies that
$|z_n q_n|_p = |z_n|_p |q_n|_p < (\epsilon/M)M = \epsilon$. Hence $\{z_n q_n\}$ is convergent to 0, and $\mathcal{I}$ is
an ideal in $\mathcal{R}$.


Proposition 6.1. With the p-adic absolute value imposed on $\mathbb{Q}$, let $\mathcal{R}$ be the
subring of $\prod_{j=1}^{\infty} \mathbb{Q}$ consisting of all Cauchy sequences, and let $\mathcal{I}$ be the ideal in


¹This construction is carried out in detail in Section II.11 of the author’s Basic Real Analysis.
$\mathcal{R}$ consisting of all sequences convergent to 0. Then $\mathcal{I}$ is a maximal ideal in $\mathcal{R}$,
and the quotient $\mathcal{R}/\mathcal{I}$ is a field. Consequently the Cauchy completion of $\mathbb{Q}$ in the
p-adic metric is a topological field $\mathbb{Q}_p$ into which $\mathbb{Q}$ embeds via a field mapping.
If $|\cdot|_p$ denotes the function $d(\cdot, 0)$ on $\mathbb{Q}_p$, then $|\cdot|_p$ is a continuous extension
of the p-adic absolute value from $\mathbb{Q}$ to $\mathbb{Q}_p$, and it satisfies

(a) $|x|_p \ge 0$ with equality if and only if $x = 0$,
(b) $|x + y|_p \le \max(|x|_p, |y|_p)$, and
(c) $|xy|_p = |x|_p |y|_p$.

The subset $\mathbb{Z}_p = \{x \in \mathbb{Q}_p \mid |x|_p \le 1\}$ is an open closed subring of $\mathbb{Q}_p$ in which
$\mathbb{Z}$ is dense, and $\mathbb{Z}_p$ is compact. Consequently the topological field $\mathbb{Q}_p$ is locally
compact.


REMARKS. The field $\mathbb{Q}_p$ is called the field of p-adic numbers, and the ring
$\mathbb{Z}_p$ is called the ring of p-adic integers. The ring $\mathbb{Z}_p$ contains the identity of $\mathbb{Q}_p$.


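PROOF. First let us prove that $\mathcal{I}$ is a maximal ideal. The ideal $\mathcal{I}$ is proper,
since the members of the constant sequence $\{1\}$ are at distance $|1 - 0|_p = 1$ from 0.
Arguing by contradiction, suppose that $\mathcal{J}$ is a proper ideal of $\mathcal{R}$ that strictly
contains $\mathcal{I}$, and let $\{q_n\}$ be a member of $\mathcal{J}$ that is not in $\mathcal{I}$, i.e., a Cauchy
sequence in $\mathcal{J}$ that is not convergent to 0. Then
there exists an $\epsilon_0 > 0$ such that $|q_n|_p \ge \epsilon_0$ for infinitely many $n$. Choose $N$ such
that $|q_n - q_m|_p < \epsilon_0/2$ whenever $n \ge N$ and $m \ge N$, and find some $n_0 \ge N$ with
$|q_{n_0}|_p \ge \epsilon_0$. Then $n \ge N$ implies that $|q_n|_p \ge \epsilon_0/2$ because otherwise we would
have $\epsilon_0 \le |q_{n_0}|_p \le |q_{n_0} - q_n|_p + |q_n|_p < \epsilon_0/2 + \epsilon_0/2 = \epsilon_0$, contradiction. Let
$\{r_n\}$ be the sequence with $r_n = 0$ for $n < N$ and $r_n = q_n^{-1}$ for $n \ge N$. For $n \ge N$
and $m \ge N$, we have

\[
|r_n - r_m|_p = |q_n^{-1} - q_m^{-1}|_p = |(q_m - q_n)/(q_m q_n)|_p
= |q_m - q_n|_p \, |q_m|_p^{-1} |q_n|_p^{-1} \le 4\epsilon_0^{-2} |q_m - q_n|_p,
\]

and it follows that $\{r_n\}$ is Cauchy and hence lies in $\mathcal{R}$. Since $\mathcal{J}$ is an ideal in $\mathcal{R}$,
$\{r_n q_n\}$ is in $\mathcal{J}$. The terms of the sequence $\{r_n q_n\}$ are all equal to 1 for $n \ge N$,
and hence $\{r_n q_n\}$ differs from the identity of $\mathcal{R}$ by a member of $\mathcal{I} \subseteq \mathcal{J}$. Consequently
the identity is in $\mathcal{J}$. This is a contradiction, since $\mathcal{J}$ was assumed proper.
Hence $\mathcal{I}$ is a maximal ideal,
and $\mathcal{R}/\mathcal{I}$ is necessarily a field.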

Meanwhile, the Cauchy completion $\mathbb{Q}_p$ of $\mathbb{Q}$ is the set of equivalence classes
from $\mathcal{R}$, two members of $\mathcal{R}$ being equivalent if they differ by a sequence convergent
to 0. Consequently the Cauchy completion $\mathbb{Q}_p$ is precisely $\mathcal{R}/\mathcal{I}$ as a set. The
mapping $\mathbb{Q} \to \mathcal{R} \to \mathcal{R}/\mathcal{I}$ carrying a member $q$ of $\mathbb{Q}$ to the constant sequence
$\{q_n\}$ with all $q_n = q$ and then from $\mathcal{R}$ to the quotient $\mathcal{R}/\mathcal{I} = \mathbb{Q}_p$ evidently respects
the operations and hence is a field mapping. This mapping identifies $\mathbb{Q}$ with a
subset of $\mathbb{Q}_p$. The metric $d$ on $\mathbb{Q}$ extends uniquely to a continuous function on
the completion $\mathbb{Q}_p \times \mathbb{Q}_p$, and therefore the p-adic absolute value $|\cdot|_p = d(\cdot, 0)$
extends to a continuous function on $\mathbb{Q}_p$.

Property (a) for the function $|\cdot|_p$ on $\mathbb{Q}_p$ follows from the fact that the
continuous extension of $d$ is a metric on $\mathbb{Q}_p$. To see that (b) and (c) hold on
$\mathbb{Q}_p$, let $x$ and $y$ be members of $\mathbb{Q}_p = \mathcal{R}/\mathcal{I}$, and let $\{q_n\}$ and $\{r_n\}$ be respective
coset representatives of them in $\mathcal{R}$. Then $\{q_n + r_n\}$ and $\{q_n r_n\}$ are representatives
of $x + y$ and $xy$ by definition, and the continuity of the p-adic absolute value on
$\mathbb{Q}_p$ implies that $\lim_n |q_n + r_n|_p = |x + y|_p$ and $\lim_n |q_n r_n|_p = |xy|_p$. From the
first of these limit formulas and from (b) on $\mathbb{Q}$, we obtain

\[
|x + y|_p = \limsup_n |q_n + r_n|_p \le \limsup_n \max(|q_n|_p, |r_n|_p) = \max(|x|_p, |y|_p),
\]

since $\lim_n |q_n|_p = |x|_p$ and $\lim_n |r_n|_p = |y|_p$. This proves (b) on $\mathbb{Q}_p$. Similarly

\[
|xy|_p = \lim_n |q_n r_n|_p = \lim_n |q_n|_p |r_n|_p = (\lim_n |q_n|_p)(\lim_n |r_n|_p) = |x|_p |y|_p,
\]

and this proves (c) on $\mathbb{Q}_p$.

To see that addition, subtraction, and multiplication are continuous on $\mathbb{Q}_p \times \mathbb{Q}_p$,
let $\{x_n\}$ and $\{y_n\}$ be convergent sequences in $\mathbb{Q}_p$ with respective limits $x$ and $y$.
Use of (b) on $\mathbb{Q}_p$ gives

\[
|(x_n + y_n) - (x + y)|_p = |(x_n - x) + (y_n - y)|_p \le \max(|x_n - x|_p, |y_n - y|_p).
\]

The right side has limit 0 in $\mathbb{R}$, and therefore $x_n + y_n$ has limit $x + y$ in $\mathbb{Q}_p$. A
completely analogous argument, making use also of the equality $|-1|_p = |1|_p$,
shows that subtraction is continuous. Consider multiplication. If $M$ is an upper
bound for the absolute values $|x_n|_p$ and $|y_n|_p$, then use of (b) and (c) on $\mathbb{Q}_p$ gives

\[
\begin{aligned}
|x_n y_n - xy|_p &= |x_n(y_n - y) + y(x_n - x)|_p \\
&\le \max(|x_n(y_n - y)|_p, |y(x_n - x)|_p) \\
&= \max(|x_n|_p |y_n - y|_p, |y|_p |x_n - x|_p) \\
&\le \max(M |y_n - y|_p, |y|_p |x_n - x|_p).
\end{aligned}
\]

The right side has limit 0 in $\mathbb{R}$, and therefore $x_n y_n$ has limit $xy$ in $\mathbb{Q}_p$.

To see that inversion $x \mapsto x^{-1}$ is continuous on $\mathbb{Q}_p^\times$, let $\{x_n\}$ be a sequence in
$\mathbb{Q}_p^\times$ with limit $x$ in $\mathbb{Q}_p^\times$. Since $\lim_n |x_n|_p = |x|_p$, we can find an integer $N$ such
that $|x_n|_p \ge \frac{1}{2}|x|_p$ for $n \ge N$. The computation

\[
|x_n^{-1} - x^{-1}|_p = |(x - x_n)/(x_n x)|_p = |x - x_n|_p/(|x_n|_p |x|_p) \le 2|x|_p^{-2} |x - x_n|_p,
\]

valid for $n \ge N$, shows that $\lim_n x_n^{-1} = x^{-1}$ and inversion is continuous. Consequently
$\mathbb{Q}_p$ is a topological field.

It follows immediately from properties (b) and (c) and from the equality
$|-x|_p = |x|_p$ that $\mathbb{Z}_p$ is a subring of $\mathbb{Q}_p$. Since $\mathbb{Z}_p$ is defined in terms of a
continuous function and an inequality $\le$, it is closed. It can also be defined as
the subset with $|x|_p < p$ because the p-adic absolute value takes no values
between 1 and $p$, and therefore $\mathbb{Z}_p$ is open. The most general nonzero member
of $\mathbb{Q} \cap \mathbb{Z}_p$ is of the form $q = a/b$, where $a$ and $b$ are relatively prime nonzero
integers with $|a/b|_p \le 1$. Here $|b|_p = 1$, and $p$ cannot divide $b$. If $k > 0$ is
given, then it follows that there exists $n$ with $bn - a \equiv 0 \bmod p^k$. This $n$ has
$|n - q|_p = |bn - a|_p \le p^{-k}$. So $q$ is in the closure of $\mathbb{Z}$ in $\mathbb{Q}_p$. In other words,
the closure of $\mathbb{Z}$ contains $\mathbb{Q} \cap \mathbb{Z}_p$. Since $\mathbb{Q}$ is dense in $\mathbb{Q}_p$ and $\mathbb{Z}_p$ is open, $\mathbb{Q} \cap \mathbb{Z}_p$
is dense in $\mathbb{Z}_p$, and hence $\mathbb{Z}$ is dense in $\mathbb{Z}_p$.
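
The integer $n$ in this density argument can be computed with a modular inverse. The following Python sketch is a minimal illustration; the function name `integer_near` is hypothetical, and the three-argument `pow` with exponent `-1` assumes Python 3.8 or later.

```python
def integer_near(a, b, p, k):
    """Return n in {0, ..., p**k - 1} with b*n - a divisible by p**k.

    Assumes that the prime p does not divide b, as in the argument above.
    """
    if b % p == 0:
        raise ValueError("p must not divide the denominator b")
    modulus = p ** k
    inverse_of_b = pow(b, -1, modulus)   # inverse of b modulo p**k
    return (a * inverse_of_b) % modulus

# Integers that approximate 2/3 more and more closely in the 5-adic metric.
for k in range(1, 5):
    n = integer_near(2, 3, 5, k)
    print(k, n, (3 * n - 2) % 5 ** k)
```

The last printed column is the residue of $bn - a$ modulo $p^k$, which the construction forces to be 0.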

For each integer $n > 0$, the set $\mathbb{Z}_p$ is covered by the closed balls of radius
$p^{-n}$ centered at the integers $0, 1, 2, \dots, p^n - 1$. In fact, every integer $z$ has $z \equiv
k \bmod p^n$ for some integer $k \in \{0, 1, 2, \dots, p^n - 1\}$. For this $k$, $|z - k|_p \le p^{-n}$.
Thus $\mathbb{Z}$ is contained in the union of the closed balls of radius $p^{-n}$ centered at
$0, 1, 2, \dots, p^n - 1$. This union is closed; since $\mathbb{Z}$ is dense in $\mathbb{Z}_p$, $\mathbb{Z}_p$ is contained
in this union. In turn, these closed balls are contained in the open balls of radius
$p^{-n+1}$ centered at the integers $0, 1, 2, \dots, p^n - 1$. Thus for any positive radius,
there exists a finite collection of open balls of that radius or less such that the
union of the open balls covers $\mathbb{Z}_p$. This means that $\mathbb{Z}_p$ is totally bounded in the
metric space $\mathbb{Q}_p$. A totally bounded closed subset of a complete metric space is
compact, and consequently $\mathbb{Z}_p$ is compact.

Thus the 0 element of $\mathbb{Q}_p$ has $\mathbb{Z}_p$ as a compact neighborhood. Since addition
is continuous, $x + \mathbb{Z}_p$ is a compact neighborhood of $x$, and therefore $\mathbb{Q}_p$ is locally
compact.


2. Discrete Valuations 


The construction of the p-adic absolute value on Q seemingly made use of unique 
factorization of the members of Z, but actually the unique factorization of the 
ideals in Z would have been sufficient. Thus we shall see in a moment that the 
construction extends to apply to any number field F as soon as we specify a 
nonzero prime ideal P in the ring R of algebraic integers of F. In fact, there 
is nothing special about a number field. If R is any Dedekind domain and F is 
its field of fractions, then the construction extends to F as soon as we specify a 
nonzero prime ideal P in R. 

Before describing the extended construction, let us look at the definition of
the p-adic absolute value on $\mathbb{Q}$ more closely. Recall that if $x = p^m a b^{-1}$ for
integers $a$ and $b$ relatively prime to $p$, then $|x|_p = p^{-m}$. Actually, the base $p$
in this exponential is not very important at this point, and we could have used
any real number $r > 1$ in place of $p$ in $p^{-m}$. With this adjustment the p-adic
absolute value would have been given by $|x|_p = r^{-v_p(x)}$, where $v_p(x)$ is the exact
net power of $p$ that occurs when the prime factorizations of the numerator and
denominator of $x$ are used. The exponent $v_p(x)$ is what is important; the base $r$
is unimportant.

The expression $v_p(x)$ for $\mathbb{Q}$ is analogous to the order of vanishing of a polynomial
in one complex variable at a point, and Hensel was led to the p-adic
absolute value by carrying the notion for $\mathbb{C}[X]$ to the setting with $\mathbb{Q}$. In setting
up a generalization, we shall work first with the generalization of the order of
vanishing $v_p(x)$, since it is the more primitive notion, and in Section 3 we shall
exponentiate to obtain a generalization of the absolute value for which we can 
form a completion. 

To make the definitions, it is convenient to make use of fractional ideals, which 
were the subject of a set of problems in Chapter VIII of Basic Algebra. Let us 
recall the definition and the relevant properties. Again let R be a Dedekind 
domain, and let $F$ be its field of fractions. A fractional ideal of $F$ is any finitely
generated $R$ submodule $M$ of $F$. For such an $R$ module, there exists some nonzero $a \in R$ with
$aM \subseteq R$, and then $aM$ is an ideal of $R$. If $M$ is any nonzero fractional ideal,
then $M^{-1} = \{x \in F \mid xM \subseteq R\}$ is a nonzero fractional ideal, and $MM^{-1} = R$.
With this definition and property, it readily follows from the unique factorization
of ideals in $R$ that any nonzero fractional ideal $M$ of $F$ is of the form

\[
M = \prod_{j=1}^{l} P_j^{k_j}
\]

for a suitable set $\{P_1, \dots, P_l\}$ of distinct nonzero prime ideals of $R$ and for suitable
nonzero integer exponents $k_j$. This expansion is unique up to the order of the
factors, and every such expression is a fractional ideal. It follows that the nonzero 
fractional ideals form a group under multiplication. At the end of this section, we 
shall mention how this group is related to the ideal class group of F as defined in 
Section V.6. 

If $x \ne 0$ is in $F$, then the principal fractional ideal $(x) = xR$ has a
factorization as above. If $P$ is a nonzero prime ideal of $R$, we let $v_P(x)$ be
the integer exponent of $P$ in the prime factorization of $(x)$, the exponent being
taken as 0 if $P$ does not occur. For
example, if $x$ is a nonzero element of $R$, then $v_P(x)$ is a nonnegative integer. To
make $v_P(\cdot)$ be everywhere defined on $F$, we define $v_P(0) = +\infty$. Then $v_P(\cdot)$
is a function from $F$ onto $\mathbb{Z} \cup \{+\infty\}$ such that

(i) $v_P(x) = +\infty$ if and only if $x = 0$,
(ii) $v_P(x + y) \ge \min(v_P(x), v_P(y))$ for all $x$ and $y$, and
(iii) $v_P(xy) = v_P(x) + v_P(y)$ for all $x$ and $y$.

We shall see in Proposition 6.4 below that the effect of $v_P(\cdot)$ is to pick out from
$F$ the localization of $R$ at $P$.
To proceed further, we abstract the above construction and see what information
we can recover from it. Let $F$ be any field. A discrete valuation of $F$ is a
function $v(\cdot)$ from $F$ onto $\mathbb{Z} \cup \{+\infty\}$ such that

(i) $v(x) = +\infty$ if and only if $x = 0$,
(ii) $v(x + y) \ge \min(v(x), v(y))$ for all $x$ and $y$, and
(iii) $v(xy) = v(x) + v(y)$ for all $x$ and $y$.

Observe as a consequence that
(iv) $v(-1) = v(1) = 0$,
(v) $v(-x) = v(x)$ for all $x$, and
(vi) $v(x + y) = v(x)$ if $v(y) > v(x)$.
In fact, $v(1) = 0$ follows by taking $x = y = 1$ in (iii), and then $v(-1) = 0$
follows by taking $x = y = -1$ in (iii). This proves (iv), and (v) follows by
combining (iv) with (iii) for $y = -1$. For (vi), we have $v(x + y) \ge v(x)$ by
(ii). In the reverse direction, $v(x) \ge \min(v(x + y), v(y))$ by (ii) and (v); since
$v(y) > v(x)$, the minimum must be the first of the two, and thus $v(x) \ge v(x + y)$.
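
Properties (ii), (iii), (v), and (vi) can be spot-checked numerically for the valuation on the rationals that counts the net power of a prime. The sketch below is an illustration only, with the hypothetical helper names `v` and `check_consequences` and with `math.inf` standing in for $+\infty$; a finite sample of course proves nothing in general.

```python
from fractions import Fraction
from itertools import product
import math

def v(x, p=3):
    """Net power of the prime p in the rational x, with v(0) = +infinity."""
    x = Fraction(x)
    if x == 0:
        return math.inf
    num, den, power = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        power += 1
    while den % p == 0:
        den //= p
        power -= 1
    return power

def check_consequences(samples, p=3):
    """Assert (ii), (iii), (v), (vi) on every ordered pair drawn from samples."""
    for x, y in product(samples, repeat=2):
        assert v(x + y, p) >= min(v(x, p), v(y, p))   # (ii)
        assert v(x * y, p) == v(x, p) + v(y, p)       # (iii)
        assert v(-x, p) == v(x, p)                    # (v)
        if v(y, p) > v(x, p):
            assert v(x + y, p) == v(x, p)             # (vi)
    return True

samples = [Fraction(a, b) for a in range(-9, 10) for b in range(1, 10)]
print(check_consequences(samples))
```

Including 0 among the samples exercises the convention $v(0) = +\infty$ in (ii) and (iii).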

Define $R_v = \{x \in F \mid v(x) \ge 0\}$. Property (i) shows that 0 is in $R_v$, (ii) and
(v) show that $R_v$ is closed under addition and subtraction, (iii) shows that $R_v$ is
closed under multiplication, and (iv) shows that 1 is in $R_v$. Consequently $R_v$ is
an integral domain. The ring $R_v$ is called the valuation ring of $v$ in $F$.

If $x$ is in $F$ but is not in $R_v$, then $v(x) < 0$. This inequality forces $v(x^{-1}) > 0$,
and $x^{-1}$ is in $R_v$. As a consequence, $F$ can be regarded as the field of fractions
of $R_v$.

Let $P_v = \{x \in F \mid v(x) > 0\}$. Arguing in similar fashion, we see that $P_v$ is
an ideal in $R_v$. Any $x$ in $R_v$ that is not in $P_v$ has $v(x) = v(x^{-1}) = 0$ and is thus
a unit in $R_v$. In other words, $R_v$ is a local ring with $P_v$ as its unique maximal
ideal. The ideal $P_v$ is called the valuation ideal of $v$ in $F$. We write $k_v$ for the
field $R_v/P_v$; it is called the residue class field of $v$.
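
For a concrete instance of the passage from $R_v$ to $k_v$, take $F$ to be the rationals and $v$ to be the valuation that counts the net power of a prime $p$. Under that assumption the quotient map sends a rational with denominator prime to $p$ to an integer residue modulo $p$. The following Python sketch uses the hypothetical name `residue` and assumes Python 3.8 or later for the modular inverse.

```python
from fractions import Fraction

def residue(x, p):
    """Image in {0, ..., p-1} of a rational x whose denominator is prime to p."""
    x = Fraction(x)
    if x.denominator % p == 0:
        raise ValueError("x is not in the valuation ring for this prime")
    return (x.numerator * pow(x.denominator, -1, p)) % p

p = 7
x, y = Fraction(3, 4), Fraction(-5, 6)
# The map respects addition and multiplication.
print(residue(x + y, p) == (residue(x, p) + residue(y, p)) % p)
print(residue(x * y, p) == (residue(x, p) * residue(y, p)) % p)
# Members of the valuation ideal map to 0; units map to nonzero residues.
print(residue(Fraction(14, 3), p), residue(x, p))
```

A member of the valuation ring is a unit exactly when its residue is nonzero, which matches the description of $P_v$ as the set of nonunits.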


Proposition 6.2. Let $v$ be a discrete valuation of a field $F$, let $R_v$ be the
valuation ring, and let $P_v$ be the valuation ideal. Then
(a) $R_v$ is a principal ideal domain,
(b) there exists an element $\pi$ in $P_v$ with $v(\pi) = 1$, and any such $\pi$ has
$P_v = (\pi)$,
(c) the nonzero ideals of $R_v$ are exactly the nonnegative integer powers of $P_v$
and are given by $P_v^n = (\pi^n) = \{x \in R_v \mid v(x) \ge n\}$ for $n \ge 0$,
(d) the nonzero fractional ideals of $R_v$ are exactly the integer powers of $P_v$
and are given by $P_v^n = (\pi^n) = \{x \in F \mid v(x) \ge n\}$ for $n \in \mathbb{Z}$.


REMARKS. When $F$ equals $\mathbb{Q}$ and $v$ counts the net power of a prime number
$p$ dividing a rational number, we see by inspection that the ring $R_v$ is the localization
of $\mathbb{Z}$ at $p$, consisting of all rational numbers with no factor of $p$ in their
denominators. The choices² for $\pi$ in (b) are the elements $rp$, where $r$ is any
nonzero rational whose numerator and denominator are both prime to $p$, and the
nonzero ideals are of the form $(p^n)$ with $n \ge 0$.


PROOF. The ideal $P_v$ contains an element $\pi$ with $v(\pi) = 1$ because $v(\cdot)$ is
assumed to be onto $\mathbb{Z} \cup \{+\infty\}$. Suppose that $x$ is a nonzero member of $P_v$ and
that $v(x) = n > 0$. Then $v(\pi^{-n}x) = 0$, and the elements $\pi^{-n}x$ and $x^{-1}\pi^n$ lie in
$R_v$. Hence $x = \pi^n(\pi^{-n}x)$ exhibits $x$ as a member of $(\pi^n)$, and $\pi^n = x(x^{-1}\pi^n)$
exhibits $\pi^n$ as a member of $(x)$. Consequently $(x) = (\pi^n)$. If $J$ is a nonzero
proper ideal in $R_v$, then it follows that $J = \pi^{n_0}R_v$, where $n_0$ is the smallest integer
such that some element $x_0$ of $J$ has $v(x_0) = n_0$. This proves (a), (b), and (c).

Since $R_v$ is a principal ideal domain, it is a Dedekind domain, and the theory of
fractional ideals is applicable. Since (c) shows the nonzero ideals to be all $P_v^n$ with
$n \ge 0$, it follows that the nonzero fractional ideals are all $P_v^n$ with $n$ an arbitrary integer.
For any integer $n > 0$, we have $(\pi^{-n})P_v^n = \pi^{-n}R_v \pi^n R_v = R_v = P_v^{-n}P_v^n$, and
thus $P_v^{-n} = (\pi^{-n})$. The latter ideal equals $\pi^{-n}R_v = \{x \in F \mid v(x) \ge -n\}$, and
this proves (d).


From property (vi) it follows for $n > 0$ that the members $x$ of the set $1 + P_v^n$ all
have $v(x) = 0$. The product of two such elements is again in the set because $P_v^n$
is an ideal. Let us see that the multiplicative inverse $x^{-1}$ of a member $x$ of the set
is in the set. We calculate that $v(x^{-1} - 1) = v(x^{-1}) + v(1 - x) = 0 + v(1 - x) =
v(1 - x) \ge n$. Hence $x^{-1}$ is in $1 + P_v^n$, and $1 + P_v^n$ is a group under multiplication.
It is a subgroup of the group $R_v^\times$ of units in $R_v$.


EXAMPLE. When $F = \mathbb{Q}$ and $v$ counts the net power of a prime number
$p$ dividing a rational number, the residue class field $k_v$ has $p$ elements, with
the integers $0, 1, \dots, p - 1$ being coset representatives. The group $R_v^\times$ is the
multiplicative group of rationals having numerators and denominators prime to
$p$. The members of $1 + P_v^n$ are rationals of the form $1 + p^n a b^{-1}$, where $a$ and $b$
are integers and $b$ is prime to $p$. If we write this as $b^{-1}(b + p^n a)$, we see that the
condition on a rational to be in $1 + P_v^n$ is that its numerator and denominator be
prime to $p$ and be congruent to each other modulo $p^n$.
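
The congruence criterion in this example can be compared with the defining condition $v(x - 1) \ge n$. The Python sketch below does so for rationals in lowest terms; both function names are hypothetical, and the check is an illustration rather than a proof.

```python
from fractions import Fraction

def in_one_plus_Pn(x, p, n):
    """Criterion from the example: numerator and denominator are prime to p
    and congruent to each other modulo p**n."""
    x = Fraction(x)                       # Fraction stores x in lowest terms
    num, den = x.numerator, x.denominator
    return num % p != 0 and den % p != 0 and (num - den) % p ** n == 0

def in_one_plus_Pn_direct(x, p, n):
    """Defining condition: x - 1 has net power of p at least n (n >= 1)."""
    y = Fraction(x) - 1
    if y == 0:
        return True                       # 0 lies in every power of the ideal
    if y.denominator % p == 0:
        return False                      # negative net power of p
    return y.numerator % p ** n == 0

samples = [Fraction(a, b) for a in range(-20, 21) for b in range(1, 21)]
agree = all(in_one_plus_Pn(x, 3, n) == in_one_plus_Pn_direct(x, 3, n)
            for x in samples for n in (1, 2, 3))
print(agree)
```

Reducing to lowest terms first matters: the criterion is stated for the numerator and denominator after common factors have been removed.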


Now we return to our first example of a discrete valuation, which was constructed
from a nonzero prime ideal $P$ in a Dedekind domain $R$. We called the
valuation $v_P(\cdot)$. We asserted earlier that the construction via $v_P(\cdot)$ picks out
the localization of $R$ at $P$ and the associated data. This assertion will be proved
in Proposition 6.4 below. We begin with a handy lemma.


²Some books use the term “uniformizer” or “uniformizing element” for any generator $\pi$ of the
principal ideal $P_v$. The generators are exactly the prime elements of the ring $R_v$.




Lemma 6.3. Let $R$ be a Dedekind domain regarded as a subring of its field of
fractions $F$, let $P$ be a nonzero prime ideal in $R$, and let $v_P$ be the valuation of $F$
defined by $P$. Then any element $x$ of $F$ with $v_P(x) = 0$ is of the form $x = ab^{-1}$
with $a$ and $b$ in $R$ and $v_P(a) = v_P(b) = 0$.


PROOF. If $x$ is an element of $F$ with $v_P(x) = 0$, write $x = a'b'^{-1}$ with $a' \in R$
and $b' \in R$. Then $v_P(a') = v_P(b') = n$ for some integer $n \ge 0$. Since $a'$ and $b'$
are in $R$, $(a')$ and $(b')$ are ordinary ideals, and their prime factorizations are into
ordinary ideals. Let the factorizations be $(a') = P^n Q_1$ and $(b') = P^n Q_2$, where
$Q_1$ and $Q_2$ are products of prime ideals not involving $P$. Since we are dealing
with ordinary ideals, $a'$ and $b'$ lie in $P^n$. Choose an element $z$ in the fractional
ideal $P^{-n}$ that is not in $P^{-n+1}$. By definition of $P^{-n}$, $zP^n$ is contained in $R$.
Hence $za'$ and $zb'$ lie in $R$. Write $(za') = P^m Q_3$ and $(zb') = P^{m'} Q_4$, where
$m \ge 0$ and $m' \ge 0$ and where $Q_3$ and $Q_4$ are ordinary ideals whose prime factorizations do
not involve $P$. Substituting for $(a')$, we obtain $(z)P^n Q_1 = P^m Q_3$ and hence
$(z)P^n = P^m Q_3 Q_1^{-1}$. From this expression we see that $Q_3 Q_1^{-1}$ is an ordinary
ideal. By definition of $P^{-n+1}$, $(z)P^{n-1}$ is not contained in $R$. Since $(z)P^{n-1} =
P^{m-1} Q_3 Q_1^{-1}$, it follows that $m = 0$. Similarly $m' = 0$. Consequently $v_P(za') =
v_P(zb') = 0$, and the lemma follows with $a = za'$ and $b = zb'$.


Proposition 6.4. Let $R$ be a Dedekind domain regarded as a subring of its
field of fractions $F$, let $P$ be a nonzero prime ideal in $R$, and let $v_P(\cdot)$ be
the corresponding valuation of $F$. If $S$ denotes the multiplicative system in $R$
consisting of the complement of $P$ and if the localization $S^{-1}R$ is regarded as a
subring of $F$, then the valuation ring $R_{v_P}$ coincides with $S^{-1}R$ and the valuation
ideal $P_{v_P}$ coincides with $S^{-1}P$.


PROOF. The set $S$ consists exactly of the members $x$ of $R$ with $v_P(x) \le 0$.
Since $v_P$ is nonnegative on $R$, these are the members $x$ of $R$ with $v_P(x) = 0$.
Thus each $x$ in $S^{-1}R$ has $v_P(x) \ge 0$, and $S^{-1}R$ is a subset of $R_{v_P}$.

For the reverse inclusion, fix a member $\pi$ of $P$ that is not in $P^2$. This element
has $v_P(\pi) = 1$. If $x$ is given in $R_{v_P}$ with $v_P(x) = n \ge 0$, then we can write
$x = \pi^n u$ for some member $u$ of $F$ with $v_P(u) = 0$. By Lemma 6.3 we can
decompose $u$ as $u = ab^{-1}$ with $a$ and $b$ in $R$ and $v_P(a) = v_P(b) = 0$. The
members of $R$ on which $v_P$ takes the value 0 are exactly the members of $S$. Thus
$u$ is exhibited as the quotient of two members of $S$, and $u$ is in $S^{-1}R$. Since $\pi$ is
in the ideal $P$ of $R$, $x = \pi^n u$ is in $S^{-1}R$. Hence $R_{v_P} = S^{-1}R$.

The ideal $S^{-1}P$ is a maximal ideal of $S^{-1}R = R_{v_P}$, and we observed just
before Proposition 6.2 that $P_{v_P}$ is the unique maximal ideal of $R_{v_P}$. Therefore
$S^{-1}P = P_{v_P}$.


Let us investigate the nature of an arbitrary discrete valuation in various settings 
involving a Dedekind domain. The main general result of this section is as follows. 
Theorem 6.5. Let $R$ be a Dedekind domain regarded as a subring of its field
of fractions $F$, and let $v$ be a discrete valuation of $F$ such that $R \subseteq R_v$. Then

(a) $P = R \cap P_v$ is a nonzero prime ideal of $R$,
(b) the associated discrete valuation $v_P$ defined by $P$ coincides with $v$,
(c) $PR_v = P_v$,
(d) $R + P_v = R_v$, and in fact $R + P_v^n = R_v$ for every integer $n \ge 1$, and
(e) the inclusion of $R$ into $R_v$ induces a field isomorphism $R/P \cong R_v/P_v$.


PROOF. Since 1 is not in $P_v$, the ideal $P$ in (a) is proper. If $a$ and $b$ are members
of $R$ such that $ab$ is in $P$, then $ab$ is in $P_v$, one of $a$ and $b$ is in $P_v$ as well as $R$,
and $P = R \cap P_v$ is a prime ideal. The ideal $P$ cannot be 0 because otherwise
every nonzero element $x$ of $R$ would have $v(x) = 0$, in contradiction to the facts
that $F$ is the field of fractions of $R$ and that $v$ takes nonzero values on $F$. Thus $P$ is a
nonzero prime ideal of $R$. This
proves (a).

For (b) and (c), let us begin by showing that $v_P(x) = 0$ implies $v(x) = 0$. By
Lemma 6.3 we can write $x = ab^{-1}$ with $a$ and $b$ in $R$ and with $v_P(a) = v_P(b) = 0$.
The values of $v_P$ show that the members $a$ and $b$ of $R$ are not in $P$. Since
$P = R \cap P_v$, neither $a$ nor $b$ is in $P_v$. Therefore $v(a) \le 0$ and $v(b) \le 0$.
Since $R \subseteq R_v$ by assumption, $v(a) \ge 0$ and $v(b) \ge 0$. We conclude that
$v(a) = v(b) = 0$ and that $v(x) = v(ab^{-1}) = v(a) - v(b) = 0$.

Now we can show that $v = v_P$ and that $PR_v = P_v$. The ideal $PR_v$ of $R_v$ has
to be of the form $P_v^e$ for some integer $e \ge 0$ by Proposition 6.2c, and the integer
$e$ has to be $> 0$ because 1 is not in $PR_v$. If a nonzero $x \in R$ has $v_P(x) = n$ for
some integer $n \ge 0$, then $xR = P^n Q$, where $Q$ is an ideal of $R$ whose prime
factorization does not involve $P$. Since $Q$ is not contained in $P$, the function $v_P$ is 0
on some member $y$ of $Q$, and the result of the
previous paragraph shows that $v(y) = 0$. Hence $y$ is a unit in
$R_v$, and $QR_v = R_v$. Therefore $xR_v = xRR_v = P^n QR_v = P^n R_v = (PR_v)^n =
P_v^{en}$, and $v(x) = en = ev_P(x)$. Since $F$ is the field of fractions of $R$, $v = ev_P$
everywhere. The images of $v$ and $v_P$ are both $\mathbb{Z} \cup \{+\infty\}$, and we conclude that $e = 1$. In other
words, $v = v_P$ and $PR_v = P_v$. This proves (b) and (c).

For the first conclusion in (d), we certainly have $R + P_v \subseteq R_v$. In the reverse
direction, let $x \in R_v$ be given. If $v(x) > 0$, then $x$ is in $P_v$, and there is nothing
to prove. If $v(x) = 0$, then (b) and Lemma 6.3 together show that we can write
$x = ab^{-1}$, where $a$ and $b$ are members of $R$ but not $P$. Since $R/P$ is a field, we
can choose $c$ in $R$ with $bc$ in $1 + P$. Then

\[
x - ac = a(b^{-1} - c) = ab^{-1}(1 - bc) = x(1 - bc).
\]

The right side is a member of $R_v P$, and (c) showed that $R_v P = P_v$. Therefore $x$
is exhibited as the sum of the member $ac$ of $R$ and the member $x(1 - bc)$ of $P_v$,
and we conclude that $R + P_v = R_v$. This proves the first conclusion in (d).
For the second conclusion in (d), we show inductively for $n \ge 1$ that $P^{n-1} + P_v^n
= P_v^{n-1}$, the case $n = 1$ being what has already been proved in (d). Assume that
case $n$ has been proved. Multiplying the equality by $P$ and using (c), we obtain
$P^n + PP_v^n = PP_v^{n-1} = PR_v P_v^{n-1} = P_v P_v^{n-1} = P_v^n$. Since $P \subseteq P_v$, the term $PP_v^n$ is
contained in $P_v^{n+1}$, but increasing the left side in this way does not increase the
right side. Thus $P^n + P_v^{n+1} = P_v^n$. This completes the induction. Using a second
induction, we show that $R + P_v^n = R_v$. We have already proved this equality for
$n = 1$. If we assume it for $n$ and substitute from what has just been proved, we
obtain $R + (P^n + P_v^{n+1}) = R_v$, and this proves case $n + 1$ since $P^n \subseteq R$. The
second conclusion of (d) thus follows by induction.

For (e), we are assuming that $R \subseteq R_v$, and we have defined $P = R \cap P_v$. Thus
the inclusion $R \to R_v$, when followed by the passage to the quotient $R_v/P_v$,
descends to the quotient as a field map $R/P \to R_v/P_v$. By (d), any member $x$ of
$R_v$ is the sum of a member $y$ of $R$ and a member $z$ of $P_v$; then $y + P$ is the member
of $R/P$ that maps to $x + P_v$ in $R_v/P_v$. Thus the field map $R/P \to R_v/P_v$ is
onto, and (e) is proved.


Corollary 6.6. Let $R$ be a Dedekind domain regarded as a subring of its field
of fractions $F$. If $x$ is a member of $F$ such that $v(x) \ge 0$ for every discrete
valuation $v$ of $F$ satisfying $R \subseteq R_v$, then $x$ lies in $R$.


PROOF. We may assume that $x \ne 0$. Write $x = ab^{-1}$ with $a$ and $b$ in
$R$. Theorem 6.5 shows that the valuations in question are the ones determined
by the nonzero prime ideals of $R$. If the principal ideals $(a)$ and $(b)$ factor as
$(a) = P_1^{j_1} \cdots P_r^{j_r}$ and $(b) = P_1^{k_1} \cdots P_r^{k_r}$, then $0 \le v_{P_i}(x) = v_{P_i}(ab^{-1}) = j_i - k_i$
for $1 \le i \le r$. Thus $j_i \ge k_i$ for all $i$, and the fractional ideal $(ab^{-1})$ equals the
product $P_1^{j_1 - k_1} \cdots P_r^{j_r - k_r}$, which is contained in $R$. Hence $x = ab^{-1}$ lies in $R$.


A finite field has no discrete valuations because of the requirement that the
image of a discrete valuation be $\mathbb{Z} \cup \{+\infty\}$. If we drop this requirement in the
definition and let $a$ be a multiplicative generator of a finite field, then any discrete
valuation $v$ would have $v(a^k) = kv(a)$ by property (iii). Taking $k$ equal to the
order of $a$ and using that $v(1) = 0$, we obtain $v(a) = 0$. Thus if we drop
the requirement about the image of a discrete valuation, the only possibility has
$v(0) = +\infty$ and $v(x) = 0$ for all $x \ne 0$. Thus this setting is not very interesting.

The settings in which discrete valuations v are of most interest to us are the 
following: 


(i) number fields, 
(ii) “function fields in one variable”³ over a base field,


3This notion has not been defined thus far in the book but will be treated in Chapter VII. The 
fields in question are finite algebraic extensions of a field k(X), where X is an indeterminate and k 
(iii) fields obtained from (i) or (ii) by a process of completion similar to that
used in forming the field of p-adic numbers. 


The first of these are the initial subject matter of algebraic number theory, and the 
second of these are the initial subject matter of algebraic geometry — the geometry 
of curves. The third of these are used as a tool in studying the other two. Section 
VIII.7 of Basic Algebra explained parts of the analogy between the first two kinds
of fields, and that is why we treat them together. We shall use Proposition 6.7 
below to determine their discrete valuations. In the case of (ii), the members of 
the base field k are regarded as constants, and the interest is only in valuations 
that are 0 on $k^\times$.


Proposition 6.7. Let $R$ be a Dedekind domain, let $F$ be its field of fractions,
let $K$ be a finite algebraic extension of $F$, and let $T$ be the integral closure of $R$
in $K$. If a discrete valuation $v$ of $K$ is $\ge 0$ on $R$, then it is $\ge 0$ on $T$.


REMARKS. We make repeated use in this chapter of the fact that T is a Dedekind 
domain in this situation. This fact was proved as Theorem 8.54 of Basic Algebra 
for the case that K is a finite separable extension of F, but it is valid without
the hypothesis of separability. The result without the hypothesis of separability 
will be proved in Chapter VII as part of an investigation of separable and “purely 
inseparable” extensions. 


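PROOF. If $x \ne 0$ is in $T$, then $x$ is a root of a monic
polynomial in $R[X]$, and thus there exist an integer $n$ and coefficients $a_{n-1}, \dots, a_0$
in $R$ such that

\[
x^n = -a_{n-1}x^{n-1} - \cdots - a_1 x - a_0.
\]

Properties (ii), (iii), and (v) of discrete valuations show from this equation that

\[
nv(x) \ge \min_{0 \le j \le n-1} (v(a_j) + jv(x)).
\]

Since $v(a_j) \ge 0$, we obtain $nv(x) \ge \min_{0 \le j \le n-1} jv(x)$. If $v(x)$ were negative, the
minimum on the right would be $(n-1)v(x)$, and the inequality would give $v(x) \ge 0$,
contradiction. It follows that
$v(x) \ge 0$. Thus $v$ is nonnegative on $T$.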


Corollary 6.8. The only discrete valuations of the field $\mathbb{Q}$ of rationals are the
ones leading to the p-adic absolute value for each prime number $p$. If $K$ is a
number field and $T$ is its ring of algebraic integers, then the only discrete
valuations of $K$ are the valuations $v_P$ corresponding to each nonzero prime ideal
$P$ of $T$.


is a field called the base field. At times later in the chapter, we shall be interested only in the case 
that the algebraic extension is separable. It will be proved in Chapter VII that for perfect fields k, 
this separability can always be arranged by adjusting the indeterminate X suitably. 
PROOF. If $v$ is an arbitrary discrete valuation of $\mathbb{Q}$, then property (iv) of discrete
valuations shows that $v(-1) = v(1) = 0$, and property (ii) allows us to conclude
that $v$ is nonnegative on all of $\mathbb{Z}$. Thus $\mathbb{Z}$ is contained in the valuation ring of $v$,
and Theorem 6.5 applies. By (a) in the theorem, the intersection of $\mathbb{Z}$ with the
valuation ideal is a nonzero prime ideal of $\mathbb{Z}$, hence is $p\mathbb{Z}$ for some prime number
$p$. Part (b) in the theorem then identifies $v$ as the valuation corresponding to $p\mathbb{Z}$.
This proves the first conclusion.

For the second conclusion, let $v$ be a discrete valuation of $K$. The restriction
to $\mathbb{Q}$ has to be a positive integral multiple of a discrete valuation of $\mathbb{Q}$ or else a
function that is identically 0 on $\mathbb{Q}^\times$. In either case, $v$ is $\ge 0$ on $\mathbb{Z}$, and Proposition
6.7 shows that $v$ is $\ge 0$ on $T$. If $R_v$ denotes the valuation ring of $v$ and $P_v$ denotes
the valuation ideal, then this says that $T \subseteq R_v$. We can therefore apply Theorem
6.5. If $P$ is defined by $P = T \cap P_v$, then (a) in the theorem shows that $P$ is a
nonzero prime ideal, and (b) shows that $v = v_P$.


Let us now consider the field $\mathbb{C}(X)$, regarding it as having some properties in
common with the number field $\mathbb{Q}$. We want to know whether some analog of
Corollary 6.8 is valid for $\mathbb{C}(X)$. The ring $\mathbb{C}[X]$ of polynomials is a principal ideal
domain with $\mathbb{C}(X)$ as field of fractions, and the nonzero prime ideals of $\mathbb{C}[X]$ are all of
the form $(X - c)$ with $c \in \mathbb{C}$ because $\mathbb{C}$ is algebraically closed. For each such
$c$, we therefore obtain a discrete valuation $v_{(X-c)}$. Are there any other discrete
valuations? If we think geometrically about this question, we can regard $\mathbb{C}(X)$
as the rational functions on the Riemann sphere, and each discrete valuation
addresses the order of vanishing of rational functions at some point of the sphere.
For the points of the sphere that correspond to points $c$ of $\mathbb{C}$, such a valuation
picks out the power of $(X - c)$ by which the rational function should be divided
in order to be regular and nonvanishing at $c$. The point $\infty$ on the Riemann sphere
behaves differently. The usual technique in complex-variable theory is to replace
$X$ by $1/X$ and examine the behavior at 0. Following that prescription, we are led
to a discrete valuation $v_\infty$ that is not of the form $v_P$ for some prime ideal $P$ of
$\mathbb{C}[X]$. The definition of $v_\infty$ on the quotient $f(X)/g(X)$ of nonzero polynomials
is

\[
v_\infty(f(X)/g(X)) = \deg g - \deg f
\]

with $v_\infty(0) = +\infty$ as usual. The next proposition, which extends one of
Liouville’s theorems in complex-variable theory⁴ from $\mathbb{C}$ to a general field $k$,
says that there are no other discrete valuations of interest for this example.


Proposition 6.9. Let k be any field, and let F = k(X) be the field of rational 
expressions in one indeterminate over k. Regard F as the field of fractions of 


4For a meromorphic function on the Riemann sphere, the sum of the orders of the poles equals 
the sum of the orders of the zeros. 
the principal ideal domain $k[X]$. Then the only discrete valuations of $F$ that
are 0 on the multiplicative group $k^\times$ of nonzero constant polynomials are the
various valuations $v_{(p)}$, where $p(X)$ is a monic prime polynomial in $k[X]$, and
the valuation $v_\infty$ that is defined on nonzero elements of $F$ by

\[
v_\infty(f(X)/g(X)) = \deg g - \deg f
\]

if $f$ and $g$ are polynomials. Moreover, any nonzero $h(X)$ in $F$ has

\[
v_\infty(h) + \sum_{\substack{p(X)\ \text{monic} \\ \text{prime in } k[X]}} (\deg p)\, v_{(p)}(h) = 0.
\]


PROOF. Let v be a discrete valuation of F that is 0 on k*. First suppose that 
v(X) > 0. Being 0 on the coefficients, v is nonnegative on all polynomials. Thus 
k[X] is contained in the valuation ring of v, and Theorem 6.5 applies. By (a) in 
the theorem, the intersection of kLX] with the valuation ideal is a nonzero prime 
ideal of k[X], hence is (p(X)) for some monic prime polynomial p(X). Part (b) 
in the theorem then identifies v as the valuation corresponding to (p(X)). 

Next suppose that v(X) < 0. Since $k[X^{-1}]$ has k(X) as field of fractions, the
argument in the previous paragraph is applicable, and we find that v is the valuation
determined by the prime ideal $(X^{-1})$ in $k[X^{-1}]$. In particular, v(X) = −1. To
find v(f) for a general polynomial $f(X) = a_n X^n + \cdots + a_1 X + a_0$ in k[X] under
the assumption that $a_n \neq 0$, we write f as $X^n(a_n + \cdots + a_1 X^{1-n} + a_0 X^{-n})$.
The member $a_n + \cdots + a_1 X^{1-n} + a_0 X^{-n}$ of $k[X^{-1}]$ is not divisible by $X^{-1}$, and
thus v is 0 on it. Consequently $v(f) = v(X^n) = n\,v(X) = -n = -\deg f$. If
f and g are both nonzero in k[X], then it follows that
$v(f/g) = v(f) - v(g) = -\deg f + \deg g = v_\infty(f/g)$. That is, $v = v_\infty$.

To prove the displayed formula, write a given nonzero member h(X) of F as
the quotient of two relatively prime polynomials, thus as h(X) = f(X)/g(X).
Factor the numerator as $f(X) = c \prod_{j=1}^{l} p_j(X)^{k_j}$ with $c \in k^\times$, and factor the
denominator similarly. If p(X) is a monic prime polynomial, then inspection of
the formula for f(X) shows that $v_{(p)}(f)$ is $k_j$ if $p = p_j$ and is 0 otherwise. Hence
$\sum_p (\deg p)\, v_{(p)}(f) = \sum_{j=1}^{l} k_j \deg p_j = \deg f$. Subtracting this formula and a
corresponding formula for g, we obtain

$$\sum_p (\deg p)\, v_{(p)}(f/g) = \deg f - \deg g = -v_\infty(h),$$

and the result follows.
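
As an illustration of the displayed formula (a worked example added here, not taken from the text), take k = Q and a rational function whose prime factorization is visible by inspection. The choice of h is an assumption made only for the sake of the computation.

```latex
\begin{align*}
h(X) &= \frac{(X^2+1)(X-1)^3}{X^4}, \\
v_{(X^2+1)}(h) &= 1, \qquad v_{(X-1)}(h) = 3, \qquad v_{(X)}(h) = -4, \\
v_\infty(h) &= \deg X^4 - \deg\bigl((X^2+1)(X-1)^3\bigr) = 4 - 5 = -1, \\
v_\infty(h) + \sum_p (\deg p)\, v_{(p)}(h)
  &= -1 + 2\cdot 1 + 1\cdot 3 + 1\cdot(-4) = 0.
\end{align*}
```

The factor deg p = 2 attached to the prime $X^2+1$ is what makes the count balance; over C that prime would split into two linear factors, each contributing 1.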


Corollary 6.10. Let k be a field, let F = k(X) be the field of rational
expressions in one indeterminate over k, let K be a finite algebraic extension of
k(X), let T be the integral closure of k[X] in K, and let v be a discrete valuation
of K that is 0 on the multiplicative group $k^\times$. Then the only possibilities for v
are as follows:


(a) v(X) ≥ 0, and there exists a unique nonzero prime ideal P in T such that
$v = v_P$,

(b) v(X) < 0, and there exists a prime ideal P in the integral closure T′ of
$k[X^{-1}]$ in K such that $P \cap k[X^{-1}] = X^{-1}k[X^{-1}]$ and such that v is the
valuation of K determined by P.


REMARK. The ideals P that occur in (b) are the ones in the prime factorization
of the ideal $X^{-1}T'$ in T′. There is at least one, and there are only finitely many.


PROOF. The argument is similar to the one for Corollary 6.8, except that
we have to take into account what Proposition 6.9 says when v(X) < 0. The
conclusion is that either v is ≥ 0 on k[X], and then Proposition 6.7 and Theorem
6.5 show that v is as in (a), or else v(X) < 0, and then Proposition 6.7 and
Theorem 6.5 show that v is as in (b).


To conclude, let us complete the remarks about fractional ideals begun early 
in this section. In the context that R is a Dedekind domain and F is its field of 
fractions, we mentioned that the nonzero fractional ideals of F form a group. We 
denote this group by Z. The nonzero principal fractional ideals form a subgroup 
P, and P is isomorphic to the multiplicative group F*. 

The point of the present discussion is that the group Z/P is isomorphic to 
the ideal class group of F as defined in the number-field setting in Section V.6. 
Recall the nature of this group. Two nonzero ideals I and J of R are equivalent
if there exist nonzero members a and b of R with aI = bJ. Proposition 5.18
showed in the number-field setting that multiplication of such ideals descends to 
a multiplication on the set of equivalence classes and that the result is a group. 
This result holds for any Dedekind domain. The group is called the ideal class 
group of F; we denote it here by C. 

To verify that C ≅ Z/P, we map each ideal I of R to its coset in Z/P. If I and
J are equivalent ideals of R and aI = bJ, then $(ab^{-1})I = J$, and I and J map
to the same coset. Thus C maps homomorphically into Z/P. If I maps into the
identity coset, then xI = R for some $x \in F^\times$. Writing x as $ab^{-1}$ with a and b in
R shows that aI = bR = (b), hence that I is equivalent to a principal ideal. Thus
the homomorphism C → Z/P is one-one. Finally if M is any nonzero fractional
ideal of F, then we can find some $x \in F^\times$ with xM ⊆ R. Here xM is an ideal
of R, and the equivalence of M and xM exhibits the class of M in Z/P as in the
image of C. Consequently C ≅ Z/P, as asserted.
The next step in analyzing and generalizing the construction of the p-adic absolute
value is to pass from the valuation, which appears in the exponent, to the absolute
value itself. If F is a field, an absolute value on F is a function | · | from F to
R such that
(i) |x| ≥ 0 with equality if and only if x = 0,

(ii) |x + y| ≤ |x| + |y| for all x and y in F,

(iii) |xy| = |x||y| for all x and y in F.
It follows directly that

(iv) |−1| = |1| = 1 and that

(v) |−x| = |x| for all x in F.


In fact, (iv) follows by combining (i) with (iii) for x = y = 1 and then with
(iii) for x = y = −1; then (v) follows by combining (iii) and (iv). The absolute
value | · | on F is said to be nonarchimedean if the following strong form of (ii)
holds:⁵


(ii′) |x + y| ≤ max(|x|, |y|) for all x and y in F.


Otherwise it is called archimedean. The inequality in (ii’) is called the ultra- 
metric inequality. When the ultrametric inequality holds, then the following 
additional condition holds: 

(vi) |x + y| = |x| whenever x and y in F have |y| < |x|.

In fact, when |y| < |x|, (ii′) immediately gives |x + y| ≤ |x|. But also (ii′) and
(v) give |x| ≤ max(|x + y|, |−y|) = max(|x + y|, |y|). On the right side, the
maximum cannot be |y| because |x| ≤ |y| is false. Thus |x| ≤ |x + y|, and (vi)
holds.
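
The ultrametric inequality (ii′) and its consequence (vi) can be checked numerically for the p-adic absolute value on Q. The following is a minimal sketch, assuming p = 5 and a small grid of sample rationals; the helper names `vp` and `abs_p` are hypothetical and are introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product


def vp(x, p):
    """p-adic valuation of a nonzero rational number x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("v_p(0) is +infinity")
    n, d, v = x.numerator, x.denominator, 0
    while n % p == 0:      # powers of p in the numerator count positively
        n //= p
        v += 1
    while d % p == 0:      # powers of p in the denominator count negatively
        d //= p
        v -= 1
    return v


def abs_p(x, p):
    """p-adic absolute value |x|_p = p^(-v_p(x)), with |0|_p = 0."""
    x = Fraction(x)
    return Fraction(0) if x == 0 else Fraction(p) ** (-vp(x, p))


p = 5
samples = [Fraction(a, b) for a, b in product(range(-12, 13), range(1, 8))]
for x, y in product(samples, repeat=2):
    ax, ay, axy = abs_p(x, p), abs_p(y, p), abs_p(x + y, p)
    assert axy <= max(ax, ay)      # property (ii')
    if ay < ax:
        assert axy == ax           # property (vi)
```

A finite check of this kind proves nothing, of course; it only makes visible that the equality in (vi) is exact whenever the two absolute values differ.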

Although it might seem counterintuitive, it turns out that the archimedean 
absolute values are easier to understand than the nonarchimedean ones in the 
number fields and function fields of interest to us. 

Because of (iii), any absolute value of F when restricted to $F^\times$ is a multiplicative
homomorphism into the positive real numbers. The image in the positive
reals is therefore a group.


EXAMPLES OF NONARCHIMEDEAN ABSOLUTE VALUES. 

(1) Let F be any field, and define |x| = 0 for x = 0 and |x| = 1 for x ≠ 0. The
result is a nonarchimedean absolute value called the trivial absolute value. It is
of no interest, and we shall tend to exclude consideration of it from our results.
Any other absolute value will be said to be nontrivial. Observe for a finite field
F that the fact that x ↦ |x| is a homomorphism from $F^\times$ to the positive reals
implies that the only absolute value on a finite field is the trivial one.


⁵ Some authors refer to a nonarchimedean absolute value as a “valuation,” using the same term
as for the functions v(·) in Section 2. There is little danger of confusing the two notions, but we
shall use the two distinct names anyway.


(2) Let F be any field, let v be a discrete valuation on F, and fix a real
number r > 1. Then $|x| = r^{-v(x)}$ defines a nonarchimedean absolute value
on F. Property (i) of absolute values follows because v(x) takes values in
Z ∪ {+∞} and is infinite if and only if x = 0, property (ii′) follows because
v(x + y) ≥ min(v(x), v(y)), and property (iii) follows because v(xy) =
v(x) + v(y). In particular, the p-adic absolute value is obtained in this way when
we take r = p, and we obtain corresponding examples for any number field F
by taking $v = v_P$ and fixing r > 1, where P is any nonzero prime ideal in the
ring of algebraic integers in F. For the function field F = k(X), we obtain
corresponding examples by taking $v = v_{(p)}$ and fixing r > 1, where p(X) is any
monic prime polynomial in k[X]. The choice $v = v_\infty$ gives us another example.
In all of these cases, the image of $F^\times$ in $\mathbb{R}^\times$ under the absolute value is discrete
in the sense that each one-point set of the image is open in the relative topology
from the positive reals. Corollary 6.17 will show conversely that any absolute
value for which the image in $\mathbb{R}^\times$ of the nonzero elements is discrete and nontrivial
is obtained in this way from a discrete valuation.

It is worth pausing to interpret some of the conclusions of Theorem 6.5 in terms
of absolute values and metrics.


Proposition 6.11. Let R be a Dedekind domain regarded as a subring of its
field of fractions F, suppose that | · | is an absolute value on F defined by means
of a discrete valuation v, and suppose that the subset $R_v$ of F for which |x| ≤ 1
contains R. If $P_v$ denotes the subset of F with |x| < 1, then $P = R \cap P_v$ is a
nonzero prime ideal of R, and also


(a) R is dense in $R_v$,
(b) $P^n$ is dense in $P_v^n$ for every n ≥ 1,
(c) $R/P \cong R_v/P_v$.


PROOF. In terms of v, the set $R_v$ is the valuation ring, and the set $P_v$ is the
valuation ideal. The hypothesis $R \subseteq R_v$ is the hypothesis of Theorem 6.5. Part (a)
of that theorem shows that $P = R \cap P_v$ is a prime ideal in R. Conclusions (a) and
(b) here follow from Theorem 6.5d. In fact, let $|x| = r^{-v(x)}$ with r > 1. Suppose
that x is given in $P_v^n$ with n ≥ 0 and that a positive number $r^{-N}$ is specified.
We may assume that N ≥ n. The condition for x to be in $P_v^n$ is that $|x| \le r^{-n}$.
Theorem 6.5d shows that we can find an $x_0$ in R such that $x_0 + y = x$ with y in
$P_v^N$, hence with $|y| \le r^{-N}$. Then $x_0$ is in R and has $|x_0 - x| = |y| \le r^{-N}$. Hence
$x_0$ is within $r^{-N}$ of x. Since $|x_0| \le \max(|x|, |y|) \le \max(r^{-n}, r^{-N}) = r^{-n}$, $x_0$ is
in $R \cap P_v^n = P^n$. Conclusion (c) is immediate from Theorem 6.5e.
EXAMPLES OF ARCHIMEDEAN ABSOLUTE VALUES. If F is any subfield of R
or C and if | · | is defined as the restriction to F of the ordinary absolute value
function, then | - | is an archimedean absolute value. Remarkably it turns out that 
there are no other archimedean absolute values, apart from “equivalent” ones in 
the sense to be defined below. We return to this matter at the end of Section 4. 
Actually, we shall be interested in archimedean absolute values only when F is 
a number field or is all of R or all of C, and we will not need to invoke any deep 
theorem for the cases of interest to us. 


Properties (i), (ii), and (v) of absolute values show that the function d with
d(x, y) = |x − y| is a metric on F, and the next section will examine what happens
when this metric is completed. The resulting fields will be generalizations of the
field of p-adic numbers and will be useful as tools in investigating number fields
and function fields in one variable.

Two absolute values $| \cdot |_1$ and $| \cdot |_2$ on the same field are said to be equivalent
if there is a positive number α such that $| \cdot |_2 = (| \cdot |_1)^\alpha$. In our passage from
a discrete valuation v to a nonarchimedean absolute value | · |, we fixed r > 1
and defined $|x| = r^{-v(x)}$. Changing r changes the absolute value to an equivalent
absolute value. In the archimedean case a positive power of an absolute value
need not be an absolute value, since the triangle inequality may fail. For example
the ordinary absolute value on R satisfies the triangle inequality; so does its α-th
power for α ≤ 1 but not for α > 1.
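
A short worked check of the last assertion, added here as an illustration: the choice α = 2 and x = y = 1 is an assumption made for concreteness, and the second line uses the standard subadditivity of $t \mapsto t^\alpha$ on the nonnegative reals for 0 < α ≤ 1.

```latex
\begin{align*}
\alpha = 2: &\quad |1+1|^{2} = 4 > 2 = |1|^{2} + |1|^{2}, \\
0 < \alpha \le 1: &\quad |x+y|^{\alpha} \le (|x|+|y|)^{\alpha} \le |x|^{\alpha} + |y|^{\alpha}.
\end{align*}
```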

Equivalent absolute values yield the same topology on F and in fact the same
Cauchy sequences.⁶ Conversely two absolute values that yield the same topology
are equivalent, according to the following proposition.


Proposition 6.12. Two nontrivial absolute values $| \cdot |_1$ and $| \cdot |_2$ on a field F
are equivalent if and only if

$$\{x \in F \mid |x|_1 > 1\} \subseteq \{x \in F \mid |x|_2 > 1\},$$

if and only if they induce the same topology on F.


REMARKS. If $| \cdot |_1$ is the trivial absolute value, then the stated inclusion holds
for all $| \cdot |_2$, but the equivalence may fail; that is why the statement has to exclude
this case. The statement of the proposition remains true if the inequalities $|x|_1 > 1$
and $|x|_2 > 1$ are replaced by $|x|_1 < 1$ and $|x|_2 < 1$, as we see by replacing x by
$x^{-1}$.


PROOF. If the two absolute values are equivalent, then it is immediate from
the definition of equivalent that equality holds in the stated inclusion. Conversely
suppose that the inclusion holds. Fix x ∈ F with $|x|_1 > 1$. Such an x exists
because $| \cdot |_1$ is nontrivial. Since $|x|_2 > 1$, there exists a real s > 0 with
$|x|_2 = |x|_1^s$. We shall show that $| \cdot |_2 = | \cdot |_1^s$.


⁶ In many books an equivalence class of absolute values on a field is called a “place” of the field.
We shall use this term in Sections 9 and 10 of this chapter.

Let y ∈ F be arbitrary with $|y|_1 \ge 1$. Find the number r ≥ 0 depending on
y such that $|y|_1 = |x|_1^r$. Let $\{a_n/b_n\}$ be a sequence of positive rationals strictly
decreasing to r such that $a_n$ and $b_n$ are both positive. Then $|y|_1 = |x|_1^r < |x|_1^{a_n/b_n}$,
from which we obtain $|y^{b_n}|_1 < |x^{a_n}|_1$ and $|x^{a_n} y^{-b_n}|_1 > 1$. By assumption,
$|x^{a_n} y^{-b_n}|_2 > 1$, and therefore $|y|_2 < |x|_2^{a_n/b_n}$. Passing to the limit, we obtain
$|y|_2 \le |x|_2^r$.

Now suppose that $|y|_1 > 1$. Arguing similarly with a sequence of positive
rationals strictly increasing to r, we obtain $|y|_2 \ge |x|_2^r$. Thus $|y|_2 = |x|_2^r$. Then
we have

$$|y|_2 = |x|_2^r = |x|_1^{rs} = |y|_1^s \quad \text{whenever } |y|_1 > 1. \qquad (*)$$


If instead $|y|_1 = 1$, then the number r in the second paragraph of the proof
is 0, and we obtain $|y|_2 \le |x|_2^0 = 1$. Replacing y by $y^{-1}$ shows also that $|y|_2 \ge 1$.
Thus $|y|_1 = 1$ implies $|y|_2 = 1$.

The remaining case is that $|y|_1 < 1$. Then we apply (*) to $y^{-1}$ and conclude
that $|y|_2 = |y|_1^s$ in this case as well. This completes the proof of the first conclusion
of the proposition.

For the final statement we know that equivalent absolute values lead to the
same topology. Conversely suppose that the absolute values are not equivalent.
By what we have just shown, there exists x ∈ F with $|x|_1 > 1$ and $|x|_2 \le 1$. Then
$\{x^{-n}\}$ is a sequence convergent to 0 in the topology from $| \cdot |_1$ but not convergent
to 0 in the topology from $| \cdot |_2$. Therefore the topologies are different.


Proposition 6.13. If | · | is an absolute value on the field F, then the topology
on F induced by the associated metric makes F into a topological field.


REMARK. The proof is similar to part of the argument that proves Proposition 
6.1 except that the general triangle inequality has to be used in place of the 
ultrametric inequality. 


PROOF. To see that addition, subtraction, and multiplication are continuous on
F, let $\{x_n\}$ and $\{y_n\}$ be convergent sequences in F with respective limits x and y.
Use of the triangle inequality on F gives

$$|(x_n + y_n) - (x + y)| = |(x_n - x) + (y_n - y)| \le |x_n - x| + |y_n - y|.$$

The right side has limit 0 in R, and therefore $x_n + y_n$ has limit x + y in F. A
completely analogous argument, making use also of the equality |−1| = |1|,
shows that subtraction is continuous. Consider multiplication. If M is an upper
bound for the absolute values $|x_n|$, then use of the multiplicative property of the
absolute value on F gives

$$|x_n y_n - xy| = |x_n(y_n - y) + y(x_n - x)| \le |x_n(y_n - y)| + |y(x_n - x)|$$
$$= |x_n||y_n - y| + |y||x_n - x| \le M|y_n - y| + |y||x_n - x|.$$

The right side has limit 0 in R, and therefore $x_n y_n$ has limit xy in F.

To see that inversion $x \mapsto x^{-1}$ is continuous on $F^\times$, let $\{x_n\}$ be a sequence in
$F^\times$ with limit x in $F^\times$. Since $\lim_n |x_n| = |x|$, we can find an integer N such that
$|x_n| \ge \tfrac{1}{2}|x|$ for n ≥ N. The computation

$$|x_n^{-1} - x^{-1}| = |(x - x_n)/(x_n x)| = |x - x_n|/(|x_n||x|) \le 2|x|^{-2}|x - x_n|,$$

valid for n ≥ N, then shows that $\lim x_n^{-1} = x^{-1}$ and inversion is continuous.
Consequently F is a topological field.


We now give a few results that limit the kinds of absolute values that can arise 
in particular situations. 


Proposition 6.14. If | · | is an absolute value on the field F for which there
is some c with |n| ≤ c for all integers n ∈ Z, i.e., for all additive multiples of 1,
then | · | is nonarchimedean. In particular, | · | is necessarily nonarchimedean if
F has characteristic different from 0.


REMARK. When c exists, then c can be taken to be 1, since the image of $F^\times$
under the absolute value is a subgroup of the positive reals and the only bounded
such subgroup is {1}.


PROOF. If x and y are in F and if n is any positive integer, then the Binomial
Theorem gives $(x + y)^n = \sum_{j=0}^{n} \binom{n}{j} x^{n-j} y^j$. Therefore

$$|x + y|^n \le \sum_{j=0}^{n} \left|\tbinom{n}{j}\right| |x|^{n-j} |y|^j$$
$$\le c \sum_{j=0}^{n} \max(|x|, |y|)^{n-j} \max(|x|, |y|)^j$$
$$= c(n + 1) \max(|x|, |y|)^n.$$

Extraction of the n-th root gives $|x + y| \le c^{1/n}(n + 1)^{1/n} \max(|x|, |y|)$. Passing
to the limit, we obtain |x + y| ≤ max(|x|, |y|).
Theorem 6.15 (Ostrowski's Theorem). If | · | is a nontrivial absolute value
on the field Q, then | · | is equivalent either to the p-adic absolute value $| \cdot |_p$ for
some prime number p or to the ordinary absolute value $| \cdot |_{\mathbb{R}}$.


REMARKS. No two of these are equivalent because $\{p^n\}$ tends to 0 relative to
the p-adic absolute value, $\{p^{-n}\}$ tends to 0 relative to the ordinary absolute value,
and $p^n$ has absolute value 1 relative to the ℓ-adic absolute value for all prime
numbers ℓ ≠ p.


PROOF. First suppose that every integer n has |n| ≤ 1. Proposition 6.14 shows
that | · | is nonarchimedean. Since | · | is nontrivial, we must have |n| < 1 for
some nonzero n, and we may take n to be positive. Since |n| is the product of |p| over
all primes dividing n, multiplicities included, some prime number p has |p| < 1.
Let us see that p is unique. If, on the contrary, |q| < 1 for a second prime number
q, choose integers a and b with ap + bq = 1. Then 1 = |1| = |ap + bq| ≤
max(|ap|, |bq|) = max(|a||p|, |b||q|) ≤ max(|p|, |q|) < 1, contradiction. If we
now define a positive real α by $|p| = p^{-\alpha}$, then it follows that $|n| = (|n|_p)^\alpha$ for
all integers n. Therefore $| \cdot | = (| \cdot |_p)^\alpha$ on all of Q.

Now suppose that n is some integer with |n| > 1. We may assume that n is
positive. For any positive integer m, the triangle inequality gives

$$|m| = |1 + \cdots + 1| \le |1| + \cdots + |1| = m.$$

In particular we have $|n| = n^\alpha$ for some real α with 0 < α ≤ 1.
We shall prove that

$$|m| \le m^\alpha \qquad (*)$$

for all positive integers m. We start by expanding m to the base n, writing

$$m = c_0 + c_1 n + c_2 n^2 + \cdots + c_{k-1} n^{k-1},$$

where k is the integer such that $n^{k-1} \le m < n^k$ and where each $c_j$ satisfies
$0 \le c_j < n$. The triangle inequality gives


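$$|m| \le |c_0| + |c_1||n| + |c_2||n|^2 + \cdots + |c_{k-1}||n|^{k-1}$$
$$\le n\,(1 + n^{\alpha} + n^{2\alpha} + \cdots + n^{(k-1)\alpha}) \qquad \text{by definition of } \alpha$$
$$= n\,\frac{n^{k\alpha} - 1}{n^{\alpha} - 1} < \frac{n\,n^{\alpha}}{n^{\alpha} - 1}\, n^{(k-1)\alpha}$$
$$\le \frac{n\,n^{\alpha}}{n^{\alpha} - 1}\, m^{\alpha} \qquad \text{since } n^{k-1} \le m.$$

In other words, there is a positive number C independent of m such that $|m| \le
C m^\alpha$ for every positive integer m. For every positive integer N, we then have
$|m|^N = |m^N| \le C m^{N\alpha}$, and thus $|m| \le C^{1/N} m^\alpha$. Letting N tend to infinity, we
obtain (*).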
Let us now improve (*) to the equality

$$|m| = m^\alpha \quad \text{for every positive integer } m. \qquad (**)$$

The integer k above has $n^{k-1} \le m < n^k$. Put $d = n^k - m$; this satisfies
$0 < d \le n^k - n^{k-1}$. Then

$$n^{k\alpha} = |n|^k = |n^k| \le |m| + |d| \le |m| + d^\alpha \le |m| + (n^k - n^{k-1})^\alpha,$$

and consequently

$$|m| \ge n^{k\alpha} - (n^k - n^{k-1})^\alpha = n^{k\alpha}\bigl(1 - (1 - \tfrac{1}{n})^\alpha\bigr) \ge m^\alpha\bigl(1 - (1 - \tfrac{1}{n})^\alpha\bigr).$$

Thus $|m| \ge C' m^\alpha$ for some positive constant C′ independent of m. For every
positive integer N, we then have $|m|^N = |m^N| \ge C' m^{N\alpha}$ and hence $|m| \ge
C'^{1/N} m^\alpha$. Letting N tend to infinity, we obtain $|m| \ge m^\alpha$. In combination with
(*), this proves (**).

Since |−m| = |m|, the equality (**) implies $|m| = (|m|_{\mathbb{R}})^\alpha$ for every integer
m. Taking quotients, we obtain $|q| = (|q|_{\mathbb{R}})^\alpha$ for every rational q.


Corollary 6.16. If | · | is a nontrivial absolute value on a number field F, then
the restriction of | · | to Q is nontrivial.


REMARK. In view of Ostrowski’s Theorem (Theorem 6.15), the restriction to 
Q therefore has to be equivalent to the p-adic absolute value for some p or to the 
ordinary absolute value. 


PROOF. Since | · | is nontrivial, there exists x with |x| > 1. Raising x to
a power if necessary, we may assume that |x| ≥ 2. Arguing by contradiction,
suppose that |q| = 1 for all nonzero q in Q. Since x is algebraic over Q, there
exist an integer n ≥ 1 and rational coefficients $q_{n-1}, \ldots, q_0$ such that

$$x^n = q_{n-1}x^{n-1} + \cdots + q_1 x + q_0.$$

Applying | · | to both sides and using that $|q_j| \le 1$ for all j gives

$$|x|^n \le |x|^{n-1} + \cdots + |x| + 1 = \frac{|x|^n - 1}{|x| - 1} \le |x|^n - 1,$$

the right-hand inequality holding because |x| ≥ 2. We have thus obtained $|x|^n \le
|x|^n - 1$ and have arrived at a contradiction.
An absolute value | · | on a field F such that the image of $F^\times$ is discrete is
called a discrete absolute value. The p-adic absolute values on Q and on $\mathbb{Q}_p$
furnish examples.


Corollary 6.17. If | · | is a nontrivial discrete absolute value on the field F,
then | · | is nonarchimedean, and $|x| = r^{-v(x)}$ for some discrete valuation v of F.


REMARKS. Example 2 of nonarchimedean absolute values shows that discrete
valuations always lead to discrete absolute values. This corollary is a converse. 
The trivial absolute value is of course nonarchimedean, but it does not arise from 
a discrete valuation. We shall not be interested in any nonarchimedean absolute 
values that do not arise from discrete valuations. 


PROOF. First we show that | · | is nonarchimedean. Proposition 6.14 immediately
handles the case that F has nonzero characteristic, and we may therefore
take the characteristic to be 0. Let D be the discrete image subgroup of $F^\times$. This
D in particular must contain the image of $\mathbb{Q}^\times$. Meanwhile, Theorem 6.15 says
that the restriction of | · | to Q has to be trivial, or equivalent to the p-adic absolute
value for some p, or equivalent to the ordinary absolute value. Under the ordinary
absolute value, the image of $\mathbb{Q}^\times$ cannot be contained in D, and the restriction
must be one of the other kinds. For all of the other kinds, the image of Z is
bounded, and Proposition 6.14 allows us to conclude that | · | is nonarchimedean.

Now that | · | is nonarchimedean, we set v(0) = +∞ and $v(x) = -\log_r |x|$
for x ≠ 0. Properties (i), (ii′), and (iii) of nonarchimedean absolute values
immediately imply the three defining properties of a discrete valuation.


Corollary 6.18. If | · | is a nontrivial discrete absolute value on a field F, then
the corresponding valuation ring $R = \{x \in F \mid |x| \le 1\}$ and the valuation ideal
$P = \{x \in F \mid |x| < 1\}$ are open and closed in F.


REMARK. Corollary 6.17 shows that | · | is defined by a discrete valuation.


PROOF. The definitions of R and P in the statement show that R is closed
and P is open. Let D be the image of $F^\times$ under | · |. A discrete subgroup
of positive reals has to be equal⁷ to {1} or to the subgroup $r^{\mathbb{Z}}$ for a unique real
r > 1. The nontriviality of | · | implies that the correct alternative is $r^{\mathbb{Z}}$. Then
the equality $R = \{x \in F \mid |x| < r\}$ shows that R is open, and the equality
$P = \{x \in F \mid |x| \le r^{-1}\}$ shows that P is closed.


Next we prove a general result applicable to number fields and to function 
fields in one variable that yields the conclusion that nonarchimedean absolute 
values in these cases are automatically discrete. The general result is obtained in 
two parts, stated as Lemma 6.19 and Proposition 6.20. 


⁷ One can invoke Lemma 5.14, for example.

Lemma 6.19. If R is a Dedekind domain regarded as a subring of its field of
fractions F, and if | · | is a nonarchimedean absolute value on F that is ≤ 1 on
R, then | · | is discrete. Hence either | · | is trivial or else it is defined by the
valuation relative to a nonzero prime ideal of R.


PROOF. The subset of x ∈ R for which |x| < 1 is a proper ideal I in R, and
we let P be a prime ideal containing I. Since R is a Dedekind domain, P defines
a corresponding discrete valuation $v_P$. Let $|x|_P = 2^{-v_P(x)}$. Then

$$\{x \in R \mid |x| < 1\} = I \subseteq P = \{x \in R \mid |x|_P < 1\},$$

and hence

$$\{x \in R \mid |x|_P = 1\} \subseteq \{x \in R \mid |x| = 1\}. \qquad (*)$$

Let π be an element of R with $|\pi|_P = \tfrac{1}{2}$. If x is an arbitrary nonzero member of
F with $|x|_P \le 1$, then Proposition 6.4 shows that we can write $x = \pi^k x'$ with
k ≥ 0, x′ in R, and $|x'|_P = 1$. Then |x′| = 1 by (*), and it follows that $|x| = |\pi|^k$.
Since $|x|_P = |\pi|_P^k$ also, there are only two possibilities. One possibility is that
|x| = |π| = 1 for all x ≠ 0, and then | · | is trivial. The other possibility is that
the subsets of F for which |x| < 1 and for which $|x|_P < 1$ coincide. In this case
we apply Proposition 6.12 and conclude that | · | and $| \cdot |_P$ are equivalent.


Proposition 6.20. Let R be a Dedekind domain regarded as a subring of its
field of fractions F, let K be a finite algebraic extension of F, and let T be the
integral closure of R in K. If | · | is a nonarchimedean absolute value on K that
is ≤ 1 on R, then it is ≤ 1 on T. Hence | · | is discrete, and either | · | is trivial
or else it is defined by the valuation relative to a nonzero prime ideal of T.


PROOF. As with Proposition 6.7, T is a Dedekind domain. If x ≠ 0 is in T,
then the minimal polynomial of x over R is a monic polynomial in R[X], and
thus there exist an integer n and coefficients $a_{n-1}, \ldots, a_0$ in R such that

$$x^n = a_{n-1}x^{n-1} + \cdots + a_1 x + a_0.$$

Taking the absolute value of both sides and using the nonarchimedean property,
we obtain

$$|x|^n \le \max_{0 \le j \le n-1} (|a_j||x|^j) \le \max_{0 \le j \le n-1} (|x|^j) = \max(1, |x|^{n-1}),$$

the second inequality holding because | · | is assumed to be ≤ 1 on R. If we could have
|x| > 1, then this inequality would read $|x|^n \le |x|^{n-1}$, which is a contradiction.
We conclude that |x| ≤ 1 for all x ∈ T. The conclusions in the last sentence of
the proposition now follow from Lemma 6.19.
Corollary 6.21. If K is a number field, then every nontrivial nonarchimedean
absolute value | · | on K comes from the valuation $v_P$ relative to some nonzero
prime ideal P in the ring of algebraic integers in K.


REMARK. Proposition 6.27 below will classify the archimedean absolute values 
on a number field. 


PROOF. Since | · | is nonarchimedean, its restriction to Q is nonarchimedean.
By Ostrowski's Theorem (or by inspection), it is ≤ 1 on Z. The result now
follows from Proposition 6.20 if we take R to be Z and F to be Q.


Corollary 6.22. Let k be a field, let F = k(X) be the field of rational
expressions in one indeterminate over k, let K be a finite algebraic extension of
k(X), let T be the integral closure of k[X] in K, and let | · | be a nontrivial
nonarchimedean absolute value on K that is 1 on the multiplicative group $k^\times$.
Then | · | is discrete, and the only possibilities for it are as follows:


(a) |X| ≤ 1, and there exists a unique nonzero prime ideal P in T such that
| · | comes from the valuation determined by P,

(b) |X| > 1, and there exists a prime ideal P in the integral closure T′ of
$k[X^{-1}]$ in K such that $P \cap k[X^{-1}] = X^{-1}k[X^{-1}]$ and such that | · |
comes from the valuation of K determined by P.


REMARKS. As with Proposition 6.7, T and T′ are Dedekind domains. If
k has nonzero characteristic, then Proposition 6.14 shows that every absolute
value is nonarchimedean. For the case that k has characteristic zero, remarks at
the end of Section 4 will indicate why every absolute value that is 1 on $k^\times$ is
nonarchimedean; we shall not need to make use of this fact, however. In any
event, just as with Corollary 6.10, the ideals P that occur in (b) are the ones in
the prime factorization of the ideal $X^{-1}T'$ in T′; there is at least one, and there
are only finitely many.


PROOF. The argument is similar to the one for Corollary 6.21, except that we
have to take into account what happens when |X| > 1. We apply Proposition
6.20 either with R = k[X] or with $R = k[X^{-1}]$.

Since | · | is 1 on $k^\times$, an inequality |X| ≤ 1 implies that | · | is ≤ 1 on k[X],
| · | being assumed to be nonarchimedean. Then Proposition 6.20 and Corollary
6.10 show that (a) holds. Similarly an inequality |X| > 1 implies that | · | is ≤ 1
on $k[X^{-1}]$ because | · | is assumed nonarchimedean. Then Proposition 6.20 and
Corollary 6.10 show that (b) holds.


Theorem 6.23 (Weak Approximation Theorem). Let $| \cdot |_1, \ldots, | \cdot |_n$ be
inequivalent nontrivial absolute values on a field F. If ε > 0 is a real number
and $x_1, \ldots, x_n$ are elements of F, then there exists y in F such that

$$|y - x_j|_j < \varepsilon \quad \text{for } 1 \le j \le n.$$
REMARKS. The special case of this theorem in which F is a number field and
the absolute values are defined by n distinct nonzero prime ideals in the ring of
algebraic integers follows from the Chinese Remainder Theorem (Theorem 8.27
of Basic Algebra, restated in the present book on page xxiii). In fact, it is enough
to handle the case that all the $x_j$'s are algebraic integers in F. Let the prime ideals
be $P_1, \ldots, P_n$, and let $| \cdot |_j = r_j^{-v_{P_j}(\cdot)}$ with $r_j > 1$. If we specify any positive
integers $k_1, \ldots, k_n$, then the Chinese Remainder Theorem produces an algebraic
integer y in F such that $y \equiv x_j \bmod P_j^{k_j}$ for 1 ≤ j ≤ n. These congruences say
that $v_{P_j}(y - x_j) \ge k_j$, hence that $|y - x_j|_j \le r_j^{-k_j}$. Thus we have only to choose
$k_1, \ldots, k_n$ large enough to make $r_j^{-k_j} < \varepsilon$ for all j, and the inequalities of the
theorem will hold.
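
The Chinese Remainder Theorem argument in these remarks can be sketched for F = Q with integer targets. The function name `crt_approximate`, the choice $r_p = p$, and the sample targets are assumptions made for the illustration; the code relies on `math.prod` and the three-argument `pow` with exponent −1, which require Python 3.8 or later.

```python
from fractions import Fraction
from math import prod


def crt_approximate(targets, eps):
    """Return (y, exponents): an integer y with |y - x_p|_p < eps for every
    prime p in `targets`, a dict {p: x_p} of integer targets."""
    exponents = {}
    for p in targets:
        k = 1
        while Fraction(1, p**k) >= eps:    # need p^(-k) < eps
            k += 1
        exponents[p] = k
    modulus = prod(p**k for p, k in exponents.items())
    y = 0
    for p, x in targets.items():
        m = p ** exponents[p]
        rest = modulus // m                # product of the other prime powers
        # This summand is x mod m and 0 mod each of the other prime powers.
        y += x * rest * pow(rest, -1, m)
    return y % modulus, exponents


targets = {2: 3, 3: -1, 5: 7}
y, exponents = crt_approximate(targets, Fraction(1, 100))
for p, x in targets.items():
    # y = x mod p^k says v_p(y - x) >= k, hence |y - x|_p <= p^(-k) < eps.
    assert (y - x) % p ** exponents[p] == 0
print(y, exponents)
```

The single integer y is simultaneously close to 3 in the 2-adic sense, to −1 in the 3-adic sense, and to 7 in the 5-adic sense, which is exactly what the theorem asserts in this special case.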


PROOF. First let us prove that we can find an element z in F with

$$|z|_1 > 1 \quad \text{and} \quad |z|_j < 1 \ \text{ for } 2 \le j \le n. \qquad (*)$$

We do so by induction on n, the case n = 2 being Proposition 6.12. Assuming
the result for n − 1, find u with $|u|_1 > 1$ and $|u|_j < 1$ for 2 ≤ j ≤ n − 1. Then by
the result for n = 2, find v with $|v|_1 > 1$ and $|v|_n < 1$. Let k > 0 be an integer
to be specified, and put

$$z = \begin{cases} u & \text{if } |u|_n < 1, \\ u^k v & \text{if } |u|_n = 1, \\ u^k (1 + u^k)^{-1} v & \text{if } |u|_n > 1. \end{cases}$$

In the second case, k is to be chosen large enough to make $|u|_j^k |v|_j < 1$ for
2 ≤ j ≤ n − 1. In the third case, k is to be chosen large enough to make
$|u|_1^k (1 + |u|_1^k)^{-1} |v|_1 > 1$, $|u|_j^k (1 - |u|_j^k)^{-1} |v|_j < 1$ for 2 ≤ j ≤ n − 1, and
$|u|_n^k (|u|_n^k - 1)^{-1} |v|_n < 1$. Then z satisfies the conditions in (*), and the inductive
proof of (*) is complete.

Applying (*), find $z_j$ such that $|z_j|_j > 1$ and $|z_j|_i < 1$ for i ≠ j. Let l be a
positive integer to be specified, and put

$$y = \sum_{j=1}^{n} \frac{x_j z_j^l}{1 + z_j^l}.$$

Since $y - x_j = -x_j(1 + z_j^l)^{-1} + \sum_{i \ne j} x_i z_i^l (1 + z_i^l)^{-1}$, we obtain

$$|y - x_j|_j \le |x_j|_j (|z_j|_j^l - 1)^{-1} + \sum_{i \ne j} |x_i|_j\, |z_i|_j^l (1 - |z_i|_j^l)^{-1}. \qquad (**)$$

For l large enough, the coefficients $(|z_j|_j^l - 1)^{-1}$ and $|z_i|_j^l (1 - |z_i|_j^l)^{-1}$ for i ≠ j
can be made as small as we please, and thus the right side of (**) can be made to
be < ε.
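
The construction in the proof can be tried out on F = Q with the ordinary, 2-adic, and 3-adic absolute values. The following is a sketch under stated assumptions: the elements `z[j]` and the targets `x[j]` are chosen by hand for the illustration, and the helper names `vp`, `abs_p`, `abs_val`, and `approximate` are hypothetical.

```python
from fractions import Fraction


def vp(x, p):
    """p-adic valuation of a nonzero rational number x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("v_p(0) is +infinity")
    n, d, v = x.numerator, x.denominator, 0
    while n % p == 0:
        n //= p
        v += 1
    while d % p == 0:
        d //= p
        v -= 1
    return v


def abs_p(x, p):
    """p-adic absolute value, with |0|_p = 0."""
    x = Fraction(x)
    return Fraction(0) if x == 0 else Fraction(p) ** (-vp(x, p))


def abs_val(x, place):
    """Ordinary absolute value for place 'inf', p-adic for a prime place."""
    return abs(Fraction(x)) if place == "inf" else abs_p(x, place)


places = ["inf", 2, 3]
# z[j] is > 1 at place j and < 1 at the other two places, as in (*).
z = {"inf": Fraction(6), 2: Fraction(3, 4), 3: Fraction(2, 3)}
# Targets to be approximated simultaneously (arbitrary sample values).
x = {"inf": Fraction(1, 2), 2: Fraction(1), 3: Fraction(-1)}

for j in places:
    assert abs_val(z[j], j) > 1
    assert all(abs_val(z[j], i) < 1 for i in places if i != j)


def approximate(l):
    """The element y = sum_j x_j z_j^l / (1 + z_j^l) from the proof."""
    return sum(x[j] * z[j] ** l / (1 + z[j] ** l) for j in places)


for l in (5, 10, 20, 40):
    y = approximate(l)
    print(l, [float(abs_val(y - x[j], j)) for j in places])
```

Each printed row lists the three distances $|y - x_j|_j$; they shrink as l grows, in line with the estimate (**).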
In this section we finish our project of establishing an abstract theory that generalizes
the construction of the field of p-adic numbers. A little care is appropriate
in stating the results. Here is an example of the cost of imprecision: We know
that the field $\mathbb{Q}_p$ is obtained by completing Q with respect to the p-adic absolute
value. We shall see in Section 5 that $\mathbb{Q}_p$ for p = 5 is obtained also by completing
the field Q(i) with respect to a certain absolute value and that in fact there are
two distinct equivalence classes of absolute values on Q(i) for which $\mathbb{Q}_5$ results
in this way. Thus a completion process is not well specified unless we include all
the data: the original field, the absolute value on it (or at least the equivalence
class of absolute values), and the mapping into the completed space.
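
A numerical hint of why two different mappings of Q(i) into $\mathbb{Q}_5$ arise, offered only as a sketch ahead of the argument in Section 5: the image of i must be a square root of −1, and −1 has exactly two square roots modulo each power of 5. The function name `sqrt_minus_one_mod` is hypothetical, and the brute-force search is an assumption of convenience rather than the method of the text.

```python
def sqrt_minus_one_mod(p, k):
    """All residues r modulo p^k with r^2 = -1 (mod p^k), by exhaustive search."""
    m = p ** k
    return [r for r in range(m) if (r * r + 1) % m == 0]


for k in range(1, 6):
    roots = sqrt_minus_one_mod(5, k)
    # Two roots at every level, and they are negatives of each other mod 5^k.
    assert len(roots) == 2
    assert (roots[0] + roots[1]) % 5 ** k == 0
    print(k, roots)
```

Sending i to one compatible family of roots or to the other gives two embeddings, matching the two prime factors in 5 = (2 + i)(2 − i).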

For this reason we introduce the notions of a valued field, namely a pair
(F, | · |) consisting of a field and an absolute value on it, and a homomorphism
of valued fields. If $(F, | \cdot |_F)$ and $(K, | \cdot |_K)$ are the two valued fields in question,
a homomorphism from the first to the second is a field map φ : F → K such
that $|x|_F = |\varphi(x)|_K$ for all x in F. We write φ* for the corresponding operation
of restriction: $\varphi^*(| \cdot |_K) = | \cdot |_F$. If φ carries F onto K, then φ is called an
isomorphism of valued fields.

A completion of a valued field $(F, | \cdot |_F)$ is defined to be a homomorphism
of valued fields $\varphi : (F, | \cdot |_F) \to (K, | \cdot |_K)$ such that $(K, | \cdot |_K)$ is complete as
a metric space and φ(F) is dense in K. The first theorem establishes existence.


Theorem 6.24. Let F be a field with a nontrivial absolute value $| \cdot |_F$, let
d be the associated metric on F, let $\mathcal{R}$ be the subring of $\prod_{j=1}^{\infty} F$ consisting of
all Cauchy sequences relative to d, and let $\mathcal{I}$ be the ideal in $\mathcal{R}$ consisting of all
sequences convergent to 0. Then $\mathcal{I}$ is a maximal ideal in $\mathcal{R}$, and the quotient $\mathcal{R}/\mathcal{I}$
is a field. Consequently the Cauchy completion of F relative to d is a topological
field $\overline{F} = \mathcal{R}/\mathcal{I}$. Let $\iota : F \to \overline{F}$ be the natural map $F \to \mathcal{R} \to \mathcal{R}/\mathcal{I}$ of F into the
Cauchy completion given by carrying members of F into constant sequences in $\mathcal{R}$,
followed by passage to the quotient. The metric $\overline{d}$ on the Cauchy completion is the
unique continuous function $\overline{d} : \overline{F} \times \overline{F} \to \mathbb{R}$ such that $\overline{d}(\iota(x), \iota(y)) = d(x, y)$. If
a real-valued function $| \cdot |_{\overline{F}}$ is defined on $\overline{F}$ by $|x|_{\overline{F}} = \overline{d}(x, 0)$ for $x \in \overline{F}$, then $| \cdot |_{\overline{F}}$
is an absolute value on $\overline{F}$, and $\iota : (F, | \cdot |_F) \to (\overline{F}, | \cdot |_{\overline{F}})$ is a homomorphism
of valued fields. Moreover, the absolute value on $\overline{F}$ is nonarchimedean if the
absolute value on F is nonarchimedean.


REMARKS. The usual construction of the Cauchy completion embeds the
original metric subspace as a dense subset of a complete metric space, and
therefore this theorem is showing that $\iota : (F, |\cdot|_F) \to (\overline{F}, |\cdot|_{\overline{F}})$ is a completion
of $(F, |\cdot|_F)$.
PROOF. The proof of this theorem is almost the same as the first part of the
proof of Proposition 6.1, apart from notational changes. The differences occur in
spots where the ultrametric inequality was invoked in the proof of Proposition 6.1
and only the triangle inequality is available here. The main such difference is the
argument that the validity of the triangle inequality on $F$ implies the validity of the
triangle inequality on $\overline{F}$, and we give that argument in a moment. Correspondingly
it is unnecessary for us to prove that the validity of the ultrametric inequality on
$F$ implies the validity of the ultrametric inequality on $\overline{F}$, because that argument
does occur in the proof of Proposition 6.1.

The other places in the proof of Proposition 6.1 where the ultrametric inequality 
was used are in the proof that the completion is a topological field. It is not 
necessary to modify that proof here, however, since we can invoke Proposition 
6.13. 

Thus let us see that the validity of the triangle inequality on $F$ implies the
validity of the triangle inequality on $\overline{F}$. To proceed, let $x$ and $y$ be members of
$\overline{F} = \mathcal{R}/\mathcal{I}$, and let $\{q_n\}$ and $\{r_n\}$ be respective coset representatives of them in $\mathcal{R}$.
Then $\{q_n + r_n\}$ is a representative of $x + y$, by definition, and the continuity of
$|\cdot|_{\overline{F}}$ on $\overline{F}$ implies that $\lim_n |q_n + r_n|_F = |x + y|_{\overline{F}}$. From this limit formula and
the triangle inequality for $F$, we obtain

$$|x + y|_{\overline{F}} = \lim_n |q_n + r_n|_F \le \limsup_n \big(|q_n|_F + |r_n|_F\big) = \limsup_n |q_n|_F + \limsup_n |r_n|_F = |x|_{\overline{F}} + |y|_{\overline{F}},$$

since $\lim_n |q_n|_F = |x|_{\overline{F}}$ and $\lim_n |r_n|_F = |y|_{\overline{F}}$. This proves the triangle inequality
on $\overline{F}$.


A valued field $(L, |\cdot|_L)$ is said to be complete if $L$ is Cauchy complete in
the metric defined by $|\cdot|_L$. In Section 6 we shall make crucial use of a universal
mapping property of the completion of a valued field.


Theorem 6.25. If $\iota : (F, |\cdot|_F) \to (K, |\cdot|_K)$ is a completion of the valued
field $(F, |\cdot|_F)$ and if $\varphi : (F, |\cdot|_F) \to (L, |\cdot|_L)$ is a homomorphism of valued
fields with $(L, |\cdot|_L)$ complete, then there exists a unique homomorphism of
valued fields $\Phi : (K, |\cdot|_K) \to (L, |\cdot|_L)$ such that $\varphi = \Phi \circ \iota$.


REMARKS. As usual with universal mapping properties, this theorem implies 
a uniqueness result: any two completions of a valued field are canonically iso- 
morphic. It is not necessary to write out the details. Making a small adjustment 
to the proof below, we see also that if a field has two equivalent absolute values 
on it, then the corresponding two completions are canonically isomorphic by a 
field map that respects the topologies. 
PROOF. The theory of completion of a metric space produces a unique continuous
function $\Phi : K \to L$ such that $\varphi = \Phi \circ \iota$, and this continuous function
respects the metrics. It is necessary to check only that $\Phi$ respects addition and
multiplication.

The argument is the same for the two operations, and we check only addition.
Let $x$ and $y$ be given in $K$, and choose sequences $\{x_n\}$ and $\{y_n\}$ in $F$ with
$\lim \iota(x_n) = x$, $\lim \iota(y_n) = y$. Since addition is continuous in $K$, $\lim \iota(x_n + y_n) =
x + y$. Since $\Phi$ is a continuous function with $\varphi = \Phi \circ \iota$,

$$\begin{aligned}
\Phi(x) + \Phi(y) &= \Phi(\lim \iota(x_n)) + \Phi(\lim \iota(y_n)) \\
&= \lim \Phi(\iota(x_n)) + \lim \Phi(\iota(y_n)) = \lim \varphi(x_n) + \lim \varphi(y_n) \\
&= \lim \big(\varphi(x_n) + \varphi(y_n)\big) = \lim \varphi(x_n + y_n) \\
&= \lim \Phi(\iota(x_n + y_n)) = \Phi(\lim \iota(x_n + y_n)) = \Phi(x + y),
\end{aligned}$$

and $\Phi$ respects addition.


Theorem 6.24 generalizes the parts of Proposition 6.1 concerning $\mathbb{Q}_p$ but not
those concerning $\mathbb{Z}_p$. The arguments concerning $\mathbb{Z}_p$ transparently made use of
the ultrametric inequality, and they used a little more. The extra fact used is
that the $p$-adic absolute value is defined from a discrete valuation. In view of
Corollary 6.17 and Example 1 of nonarchimedean absolute values in the previous
section, a necessary and sufficient condition for a nontrivial absolute value on a
field $F$ to be obtained from a discrete valuation is that the image of $F^\times$ under
the valuation be a discrete subset of the positive reals. Such an absolute value is
automatically nonarchimedean.
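As a concrete illustration of a discrete absolute value, the following is a minimal sketch of the $p$-adic valuation and absolute value on $\mathbb{Q}$. The helper names `vp` and `abs_p` are hypothetical and chosen only for this example; the code simply checks on a few sample rationals that the values lie in the discrete set $p^{\mathbb{Z}}$ and that the ultrametric inequality holds.

```python
from fractions import Fraction

def vp(x, p):
    """p-adic valuation of a nonzero rational x (hypothetical helper)."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("v_p(0) is +infinity")
    v, num, den = 0, x.numerator, x.denominator
    while num % p == 0:      # powers of p in the numerator count positively
        num //= p
        v += 1
    while den % p == 0:      # powers of p in the denominator count negatively
        den //= p
        v -= 1
    return v

def abs_p(x, p):
    """p-adic absolute value |x|_p = p^(-v_p(x)), with |0|_p = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    return Fraction(p) ** (-vp(x, p))

samples = [Fraction(50), Fraction(3, 25), Fraction(7, 10), Fraction(-125, 2)]
values = [abs_p(x, 5) for x in samples]

# Ultrametric inequality |x + y| <= max(|x|, |y|) on all pairs of samples.
ultrametric_ok = all(
    abs_p(x + y, 5) <= max(abs_p(x, 5), abs_p(y, 5))
    for x in samples for y in samples
)
```

Every nonzero value produced by `abs_p(., 5)` is an integer power of 5, which is the discreteness condition described above in the special case $F = \mathbb{Q}$.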


Theorem 6.26. Let $\iota : (F, |\cdot|_F) \to (\overline{F}, |\cdot|_{\overline{F}})$ be a completion of a valued
field, and suppose that $|\cdot|_F$ is nontrivial and discrete. Let $v(\cdot)$ be the discrete
valuation that defines $|\cdot|_F$ on $F$. Then
(a) the image $|\overline{F}^\times|_{\overline{F}}$ equals the image $|F^\times|_F$, and $|\cdot|_{\overline{F}}$ on $\overline{F}$ is therefore
defined by a discrete valuation $\overline{v}(\cdot)$ on $\overline{F}$ such that $\overline{v} \circ \iota = v$,
(b) the image $\iota(R)$ of the valuation ring $R$ of $v$ is dense in the valuation ring
$\overline{R}$ of $\overline{v}$,
(c) for every integer $n > 0$, the image $\iota(P^n)$ of the $n$th power $P^n$ of the
valuation ideal $P$ of $v$ is dense in the $n$th power $\overline{P}^n$ of the valuation ideal
$\overline{P}$ of $\overline{v}$,
(d) the residue class fields of $F$ and $\overline{F}$ coincide in the sense that the mapping
$\iota : R \to \overline{R}$ descends to a field isomorphism of $R/P$ onto $\overline{R}/\overline{P}$,
(e) for every integer $n > 0$, the mapping $\iota : R \to \overline{R}$ descends to a ring
isomorphism of $R/P^n$ onto $\overline{R}/\overline{P}^n$,
(f) $\overline{R}$ is compact if $R/P$ is finite, and in this case the topological field $\overline{F}$ is
locally compact.
REMARK. No assertion is made in (d) and (e) about whether the topologies
match under the constructed isomorphisms. Our interest will be mostly in the case 
that R/P is finite, in which case the topologies match because they are discrete. 


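PROOF. Write $|F^\times|_F$ in the form $r^{\mathbb{Z}}$ for a unique real number $r > 1$. For
(a), since $|\iota(x)|_{\overline{F}} = |x|_F$ and since $\iota(F)$ is dense in $\overline{F}$, the continuity of the
absolute value $|\cdot|_{\overline{F}}$ implies that the image of $\overline{F}^\times$ is contained in the closure of
$r^{\mathbb{Z}}$ within the positive reals, which is $r^{\mathbb{Z}}$. The formula $\overline{v} \circ \iota = v$ follows from the
computation $r^{-v(x)} = |x|_F = |\iota(x)|_{\overline{F}} = r^{-\overline{v}(\iota(x))}$ by taking the logarithm to the
base $r$.

For (b) and (c), we use that $\iota(F)$ is dense in $\overline{F}$, and we treat (b) as the case $n = 0$
of (c). Fix $n \ge 0$ and consider $\overline{P}^n$. Choose a sequence $\{x_k\}$ in $F$ with $\{\iota(x_k)\}$
converging to a point $x$ in $\overline{P}^n$. Since $|x|_{\overline{F}} \le r^{-n}$, we must have $|x_k|_F < r^{-n+1}$
for all sufficiently large $k$. The elements $x_k$ satisfying this condition are in $P^n$,
and thus $\iota(P^n)$ is dense in $\overline{P}^n$.

For (d) and (e), the mapping $R \to \overline{R}/\overline{P}^n$ descends to $R/P^n$, since $\iota(P^n) \subseteq \overline{P}^n$.
The descended map is one-one, since if $x \in R$ maps to the 0 coset, then $x$ is in
$\iota^{-1}(\overline{P}^n) = P^n$. To see that the descended map is onto, let a coset $\overline{x} + \overline{P}^n$ be
given. Since $\iota(R)$ is dense in $\overline{R}$, we can choose $x \in R$ with $|\iota(x) - \overline{x}|_{\overline{F}} \le r^{-n}$.
Since $\overline{P}^n = \{y \in \overline{F} \mid |y|_{\overline{F}} \le r^{-n}\}$, $\iota(x) - \overline{x}$ is in $\overline{P}^n$. Hence $\iota(x)$ is exhibited
as in $\overline{x} + \overline{P}^n$, and the coset $x + P^n$ maps to the coset $\overline{x} + \overline{P}^n$.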

In (f), Corollary 8.60 of Basic Algebra shows that $P^n/P^{n+1}$ is a 1-dimensional
vector space over $R/P$. The First Isomorphism Theorem gives an $R$ module
isomorphism $(R/P^{n+1})/(P^n/P^{n+1}) \cong R/P^n$, and it follows by induction on $n$
that the finiteness of $R/P$ implies the finiteness of $R/P^n$. In view of (e), $\overline{R}/\overline{P}^n$
is finite for every $n > 0$.

For each $n > 0$, the set $\overline{R}$ is covered by the cosets of $\overline{P}^n$, which are closed
balls in $\overline{F}$ of radius $r^{-n}$ and open balls of radius $r^{-n+1}$. Thus for any positive
radius, there exists a finite collection of open balls of that radius or less such that
the union of the open balls covers $\overline{R}$. This means that $\overline{R}$ is totally bounded in the
metric space $\overline{F}$. A totally bounded closed subset of a complete metric space is
compact, and consequently $\overline{R}$ is compact.

Thus the 0 element of $\overline{F}$ has $\overline{R}$ as a compact neighborhood. Since addition is
continuous, each member $x$ of $\overline{F}$ has $x + \overline{R}$ as a compact neighborhood of $x$, and
therefore $\overline{F}$ is locally compact.
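Parts (e) and (f) can be made tangible in the familiar case of the $p$-adic valuation on $\mathbb{Q}$, where the valuation ring consists of the rationals $a/b$ with $p \nmid b$ and the quotient by the $n$th power of the valuation ideal is $\mathbb{Z}/p^n\mathbb{Z}$. The sketch below is only an illustration under that assumption; the function name `residue_mod_pn` is hypothetical. It computes the integer in $\{0, \dots, p^n - 1\}$ whose ball of radius $p^{-n}$ contains a given element, so that the $p^n$ such balls form the finite cover used in the total-boundedness argument.

```python
from fractions import Fraction

def residue_mod_pn(x, p, n):
    """Representative in {0, ..., p^n - 1} of the coset of x modulo the n-th
    power of the valuation ideal, for a rational x whose denominator is prime to p."""
    x = Fraction(x)
    if x.denominator % p == 0:
        raise ValueError("x is not in the valuation ring")
    modulus = p ** n
    # Dividing by the denominator is legitimate because it is a unit modulo p^n.
    return x.numerator * pow(x.denominator, -1, modulus) % modulus

x = Fraction(1, 3)
reps = [residue_mod_pn(x, 5, n) for n in range(1, 5)]

# x - reps[n-1] is divisible by 5^n, i.e. |x - rep|_5 <= 5^(-n).
close = all((x - r).numerator % 5 ** n == 0 for n, r in enumerate(reps, start=1))
```

The successive representatives are compatible with one another under reduction, which is the finite-level shadow of the digit expansion of $1/3$ in $\mathbb{Z}_5$.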


Let us review briefly. We start with an absolute value on a field $F$. The
cases of initial interest are that $F$ is a number field or is a function field in one
variable, namely a finite algebraic extension of a field $k(X)$, where $k$ is a given
base field; in the latter case we assume that the absolute value is identically 1
on $k^\times$. A number field can have archimedean absolute values, and we come
to them in a moment. In the function-field case we know that every absolute
value is nonarchimedean if $k$ has nonzero characteristic; this remains true for
characteristic zero but we did not prove it. For our cases of interest the
nonarchimedean nontrivial absolute values are always given by a discrete valuation.

Thus let us summarize what happens for a nonarchimedean nontrivial absolute
value that is given by a discrete valuation. Within the given field $F$ we have
singled out a Dedekind domain $R$ for which $F$ is the field of fractions,^8 and the
absolute value is $\le 1$ on $R$. For example, in the number-field case $R$ is the ring
of algebraic integers in $F$. In all cases the discrete valuation $v$ is determined by a
nonzero prime ideal $\mathfrak{p}$ of $R$, and the absolute value on $F$ is given by $|x|_F = r^{-v(x)}$
for some number $r > 1$. Our two-step process consists in a step of localization
and a step of completion. The step of localization passes to the principal ideal
domain $S^{-1}R$ with maximal ideal $S^{-1}\mathfrak{p}$, where $S$ is the complement of $\mathfrak{p}$ in $R$.
The domain $S^{-1}R$ coincides with the valuation ring of $v$, and the ideal $S^{-1}\mathfrak{p}$
coincides with the valuation ideal of $v$. The absolute value on $F$ does not change
during this process of localization. The ideal $S^{-1}\mathfrak{p}$ is principal in $S^{-1}R$, say with
$\pi$ as a generator. The element $\pi$ can be chosen to be in $\mathfrak{p}$, and it has $v(\pi) = 1$.
Theorem 6.5 and Proposition 6.11 govern relationships between $R$ and $S^{-1}R$.
Briefly the powers of $\mathfrak{p}$ are dense in the powers of $S^{-1}\mathfrak{p}$, and the natural map of
residue class fields $R/\mathfrak{p} \to S^{-1}R/S^{-1}\mathfrak{p}$ is a field isomorphism onto.

The second step is a step of completion with respect to the absolute value.
The completion of a valued field $(F, |\cdot|_F)$ is a homomorphism of valued fields
$\iota : (F, |\cdot|_F) \to (L, |\cdot|_L)$ such that $(L, |\cdot|_L)$ is complete as a metric space
and $\iota$ carries $F$ onto a dense subfield of $L$. This exists by Theorem 6.24. In
the situation with a nonarchimedean nontrivial absolute value that is given by a
discrete valuation, one often writes $F_{\mathfrak{p}}$ for the completed field $L$. The eventual
interest is partly in what happens to $R$ and $\mathfrak{p}$, but we first consider $S^{-1}R$ and
$S^{-1}\mathfrak{p}$. The completed absolute value $|\cdot|_{F_{\mathfrak{p}}}$ is given by a discrete valuation $\overline{v}$
with $\overline{v} \circ \iota = v$. Let us write $R_{\mathfrak{p}}$ for its valuation ring and $\mathfrak{p}_{\mathfrak{p}}$ for its valuation
ideal. Theorem 6.26 governs the relationships between $S^{-1}R$ and $R_{\mathfrak{p}}$. Briefly
the images under $\iota$ of the powers of $S^{-1}\mathfrak{p}$ are dense in the powers of $\mathfrak{p}_{\mathfrak{p}}$, and the
natural map of residue class fields $S^{-1}R/S^{-1}\mathfrak{p} \to R_{\mathfrak{p}}/\mathfrak{p}_{\mathfrak{p}}$ induced by $\iota$ is a field
isomorphism onto.

The case of most interest for number theory is the case of a number field $F$ and
the absolute value determined by a nonzero prime ideal $\mathfrak{p}$ in the ring of algebraic
integers of $F$. The field $F_{\mathfrak{p}}$ is called the field of $\mathfrak{p}$-adic numbers, and the ring
$R_{\mathfrak{p}}$ is called the ring of $\mathfrak{p}$-adic integers. When $F = \mathbb{Q}$ and $\mathfrak{p} = p\mathbb{Z}$ for a prime
number $p$, the element $\pi$ can be taken to be $p$.


^8 The case $R = F$ is excluded; this is the case that produces the trivial absolute value, which
does not interest us.
In the case of a function field in one variable that is most analogous to a
number field, one starts from a field $F$ that is a finite algebraic extension of
$\mathbb{F}_q(X)$, where $\mathbb{F}_q$ is a finite field with $q$ elements. According to Corollary 6.22,
all but finitely many of the nonarchimedean absolute values are defined in terms
of nonzero prime ideals in the integral closure of $\mathbb{F}_q[X]$ in $F$; the others are
the prime constituents of the ideal $X^{-1}\mathbb{F}_q[X^{-1}]$ in $\mathbb{F}_q[X^{-1}]$. One can show that
the ring in the completion analogous to $R_{\mathfrak{p}}$ is always a ring of formal power
series $\mathbb{F}_{q'}[[X]]$ in one indeterminate $X$ and with coefficients in a finite extension
$\mathbb{F}_{q'}$ of $\mathbb{F}_q$. Elements of this ring are arbitrary formal power series of the form
$\sum_{k=0}^{\infty} c_k X^k$ with all $c_k$ in $\mathbb{F}_{q'}$. The field of fractions analogous to $F_{\mathfrak{p}}$ is always a
field of formal Laurent series $\mathbb{F}_{q'}((X))$ in one indeterminate; nonzero elements
of this field are arbitrary expressions of the form $\sum_{k=-N}^{\infty} c_k X^k$ with all $c_k$ in $\mathbb{F}_{q'}$,
with $c_{-N} \ne 0$, and with $N$ depending on the element.
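To make the ring of formal power series concrete, here is a minimal sketch of truncated arithmetic in $\mathbb{F}_p[[X]]$. For simplicity it assumes that the coefficient field is a prime field $\mathbb{F}_p$ rather than a general $\mathbb{F}_{q'}$, and the function names are hypothetical. A series is stored as its list of coefficients $c_0, c_1, \dots$ modulo $X^{\text{prec}}$; the inversion routine shows that every series with nonzero constant term is a unit, which is why each nonzero Laurent series is a power of $X$ times such a unit.

```python
def series_mul(a, b, p, prec):
    """Product of two truncated power series over F_p, modulo X^prec."""
    out = [0] * prec
    for i, ai in enumerate(a[:prec]):
        for j, bj in enumerate(b[:prec - i]):
            out[i + j] = (out[i + j] + ai * bj) % p
    return out

def series_inv(a, p, prec):
    """Inverse modulo X^prec of a power series whose constant term is nonzero in F_p."""
    if a[0] % p == 0:
        raise ValueError("constant term must be a unit")
    a = list(a) + [0] * max(0, prec - len(a))   # pad with zero coefficients
    inv0 = pow(a[0], -1, p)
    b = [inv0]
    for k in range(1, prec):
        # The coefficient of X^k in a*b must vanish; solve for b[k].
        s = sum(a[j] * b[k - j] for j in range(1, k + 1)) % p
        b.append((-inv0 * s) % p)
    return b

inv_mod2 = series_inv([1, 1], 2, 6)   # 1/(1 + X) over F_2
inv_mod3 = series_inv([1, 1], 3, 6)   # 1/(1 + X) over F_3
check = series_mul([1, 1], inv_mod3, 3, 6)
```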


Let us now examine archimedean completions. We shall discuss what happens
when we start from a number field, and then we make some remarks without proof
about the general case. Thus let $F$ be a number field, and let an archimedean
absolute value be given on it. To have notation parallel to the nonarchimedean
case, it is customary to index the absolute value^9 by a symbol like $v$, writing $|\cdot|_v$
for it. Corollary 6.16 shows that the restriction of $|\cdot|_v$ to $\mathbb{Q}$ is nontrivial, and the
combination of Proposition 6.14 and Ostrowski’s Theorem (Theorem 6.15) shows
that the restriction to $\mathbb{Q}$ is equivalent to the ordinary absolute value. Adjusting
$|\cdot|_v$ within its equivalence class, we may assume that its restriction to $\mathbb{Q}$ matches
the ordinary absolute value. Using Theorem 6.24, we form the completion of $F$
with respect to $|\cdot|_v$, writing $F_v$ for the completed space. The limits of Cauchy
sequences from $\mathbb{Q}$ itself show that $\mathbb{R}$ lies in the completed space, since $|\cdot|_v$
matches the ordinary absolute value on $\mathbb{Q}$. Thus we can regard $\mathbb{R}$ as a subfield
of $F_v$, and $F$ is a subfield as well. Consequently the set $\mathbb{R}F$ of sums of products
is a subring of $F_v$. The multiplication mapping of $\mathbb{R} \times F$ into $F_v$ is $\mathbb{Q}$ bilinear
and has a linear extension $\mathbb{R} \otimes_{\mathbb{Q}} F \to F_v$ whose image is $\mathbb{R}F$. The $\mathbb{R}$ dimension
of $\mathbb{R} \otimes_{\mathbb{Q}} F$ is $[F : \mathbb{Q}]$, and consequently the $\mathbb{R}$ dimension of $\mathbb{R}F$ is $\le [F : \mathbb{Q}]$,
hence finite. Being a finite-dimensional $\mathbb{R}$ algebra embedded in a field, $\mathbb{R}F$ is a
subfield^10 of $F_v$. It is therefore a finite algebraic extension of $\mathbb{R}$ and must be $\mathbb{R}$
or $\mathbb{C}$. Thus $F$ lies in $\mathbb{R}$ or $\mathbb{C}$. The fields $\mathbb{R}$ and $\mathbb{C}$ are complete relative to the
ordinary absolute value, and hence $\mathbb{R}F$ is a closed subset of $F_v$. Since $F$ is dense,
we conclude that $F_v$ is $\mathbb{R}$ or $\mathbb{C}$.

^9 Or the equivalence class of the absolute value.
^10 Within a field if a nonzero element is algebraic over a base field, then the smallest ring containing
the base field and the element contains also the inverse of the element.

Visualize having a standard copy of $\mathbb{C}$ available, with $\mathbb{R}$ embedded in it. From
the above remarks, any archimedean absolute value of the number field $F$, after
adjustment within its equivalence class, yields a completion that takes one of the
two forms

$$\sigma : (F, |\cdot|_v) \to (\mathbb{R}, |\cdot|) \quad\text{and}\quad \sigma : (F, |\cdot|_v) \to (\mathbb{C}, |\cdot|),$$

where $|\cdot|$ is ordinary absolute value on $\mathbb{R}$ or $\mathbb{C}$. Conversely any field mapping $\sigma$
of $F$ into $\mathbb{R}$ or $\mathbb{C}$ has dense image either in $\mathbb{R}$ or in $\mathbb{C}$ and defines an archimedean
absolute value on $F$ by $|\cdot|_v = \sigma^*(|\cdot|)$. Then $\sigma : (F, |\cdot|_v) \to (\mathbb{R} \text{ or } \mathbb{C}, |\cdot|)$
is a completion by Theorem 6.25.

To classify the archimedean absolute values up to equivalence, we recall from
Section V.2 that the number of distinct field maps $\sigma$ into $\mathbb{C}$ of a number field
$F$ of degree $[F : \mathbb{Q}] = n$ is exactly $n$, with a certain number $r_1$ of them having
image in $\mathbb{R}$ and with the remainder $2r_2$ having image in $\mathbb{C}$ but not $\mathbb{R}$ and occurring
in complex conjugate pairs. Each such field map $\sigma$ gives us a completion. The
members of a complex conjugate pair result in the same absolute value on $F$ when
the ordinary absolute value of $\mathbb{C}$ is restricted to $F$. We shall show that there are
no other equivalences.


Proposition 6.27. Let $F$ be a number field with $[F : \mathbb{Q}] = n$, and let there
be $r_1$ distinct field maps of $F$ into $\mathbb{R}$ and $r_2$ complex conjugate pairs of distinct
field maps of $F$ into $\mathbb{C}$, with $r_1 + 2r_2 = n$. Each such field map $\sigma$ induces an
archimedean absolute value on $F$ by restriction from $\mathbb{R}$ or $\mathbb{C}$, the only equivalences
are the ones from pairs of field maps related by complex conjugation, and the
resulting collection of $r_1 + r_2$ absolute values exhausts the archimedean absolute
values on $F$, up to equivalence.


PROOF. The remarks above show everything except that these $r_1 + r_2$ absolute
values are mutually inequivalent. To prove this fact, suppose that $\sigma$ and $\sigma'$ are two
field maps of $F$ into the same field, $\mathbb{R}$ or $\mathbb{C}$, such that $x \mapsto |\sigma(x)|$ is equivalent
to $x \mapsto |\sigma'(x)|$. Then $\varphi = \sigma'\sigma^{-1}$ is a field isomorphism from image $\sigma$ onto
image $\sigma'$ that respects the absolute value, up to a power. It is therefore uniformly
continuous from image $\sigma$ onto image $\sigma'$. Consequently $\varphi$ extends to all of $\mathbb{R}$ or $\mathbb{C}$,
and the continuous extension respects the field operations. On $\mathbb{Q}$, $\varphi$ is the identity,
and hence its continuous extension to $\mathbb{R}$ must be the identity. Thus the continuous
extension is an automorphism of $\mathbb{R}$ or $\mathbb{C}$ that fixes $\mathbb{R}$, and consequently it must
be the identity or complex conjugation.
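As a worked illustration of the count $r_1 + r_2$ (the three fields below are standard examples chosen here, not examples taken from the text), one can tabulate the archimedean absolute values explicitly. Here $a, b, c$ denote rational coefficients and $\omega = e^{2\pi i/3}$.

```latex
\begin{array}{llll}
F & n & (r_1, r_2) & \text{archimedean absolute values, up to equivalence} \\[2pt]
\mathbb{Q}(\sqrt{2}) & 2 & (2, 0) &
  |a + b\sqrt{2}|, \quad |a - b\sqrt{2}| \\
\mathbb{Q}(i) & 2 & (0, 1) &
  |a + bi|_{\mathbb{C}} = \sqrt{a^2 + b^2} \\
\mathbb{Q}(\sqrt[3]{2}) & 3 & (1, 1) &
  |a + b\sqrt[3]{2} + c\sqrt[3]{4}|, \quad
  |a + b\,\omega\sqrt[3]{2} + c\,\omega^2\sqrt[3]{4}|_{\mathbb{C}}
\end{array}
```

In the first row the two absolute values are visibly inequivalent, since $1 + \sqrt{2}$ has absolute value greater than 1 under the first embedding and less than 1 under the second.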


It is of some interest to know what archimedean absolute values can occur in
other situations, besides number fields, and Theorem 6.24 shows that it is enough
to classify the complete ones. Ostrowski did so, and the result is that $\mathbb{R}$ and $\mathbb{C}$,
with their ordinary absolute values, are the only complete archimedean fields up
to equivalence.^11


^11 A proof of the Ostrowski result may be found in Hasse’s Number Theory, pp. 191-194. Gelfand
5. Hensel’s Lemma


Hensel’s Lemma is a device that in its simplest forms allows one to solve polynomial
equations in the field $\mathbb{Q}_p$ of $p$-adic numbers by using congruence information
modulo some power of $p$. It has a number of distinct formulations, all of which
work within any complete nonarchimedean valued field, not limited to $\mathbb{Q}_p$. We
shall give a fairly simple formulation and obtain a handy special case as a corollary,
using an adaptation of Newton’s method of iterations in calculus for finding roots
of polynomials. At the end of the section, we shall state without proof a version of
Hensel’s Lemma that works to factor polynomials rather than to find their roots.
Yet another formulation of Hensel’s Lemma, whose precise statement we omit,
applies to systems of polynomial equations in several variables.

No overarching result of this chapter actually makes use of any version of 
Hensel’s Lemma. Instead, versions of Hensel’s Lemma are indispensable in 
analyzing the fine structure of complete valued fields and in handling examples. 
Thus the applications of Hensel’s Lemma in this book will occur in the examples of 
this section and the next and also in problems at the end of the chapter. Problem 16 
is one such problem. 


Theorem 6.28 (Hensel’s Lemma). Let $F$ be a field with a nontrivial discrete
absolute value $|\cdot|$, necessarily nonarchimedean, and assume that $F$ is complete.
Let $R$ be the valuation ring, and let $f(X)$ be a polynomial in $R[X]$. Suppose that
$a_0$ is a member of $R$ such that

$$|f(a_0)/f'(a_0)^2| < 1.$$

Then the sequence $\{a_n\}$ recursively given by

$$a_{n+1} = a_n - \frac{f(a_n)}{f'(a_n)}$$

is well defined in $R$ and converges to a root $a$ of $f(X)$ that satisfies $|a - a_0| < 1$.


PROOF. Put $c = |f(a_0)|/|f'(a_0)|^2 < 1$. We prove the following three
statements together by induction on $n$:

(i) $a_n$ is well defined and is in $R$,
(ii) $|f'(a_n)| = |f'(a_0)| \ne 0$, and
(iii) $|f(a_n)|/|f'(a_n)| \le c^{2^n} |f'(a_0)|$.


and Tornheim proved a more general result, with the same conclusion, that allows the multiplicative 
property of absolute values to be relaxed somewhat. A proof of this result appears in Artin’s Theory 
of Algebraic Numbers, pp. 45-51. 
The base case for the induction is the case $n = 0$, and the three statements are
true by hypothesis in this case.

Assume that the three statements hold for $n$. From (ii), $a_{n+1}$ is defined, and
then (iii) shows that $a_{n+1}$ satisfies

(iii') $|a_{n+1} - a_n| = |f(a_n)/f'(a_n)| \le c^{2^n} |f'(a_0)|$.

The fact that $a_n$ and $f'(a_0)$ are in $R$, in combination with (iii'), shows that $a_{n+1}$
is in $R$. This proves (i) for $n + 1$.

For (ii) and (iii), we make use of the following Taylor expansions of $f(X)$ and
$f'(X)$ about $b$:

$$f(X) = f(b) + (X - b) f'(b) + (X - b)^2 g(X) \quad\text{with } g(X) \in R[X]$$

and

$$f'(X) = f'(b) + (X - b) h(X) \quad\text{with } h(X) \in R[X].$$

To check that these expansions are valid in any characteristic, it is enough to
check the first one, since the second one follows by differentiation. For the first
one, it is enough to treat the special case $X^k$. Dividing $X^k - b^k$ by $X - b$, we see
that we are to produce $g(X)$ such that

$$(X - b)\,g(X) = \sum_{j=0}^{k-1} X^j b^{k-1-j} - k b^{k-1} = \sum_{j=0}^{k-1} (X^j - b^j)\, b^{k-1-j}.$$

Every term on the right side is divisible by $X - b$, and thus the quotient $g(X)$ is
in $R[X]$.

Put $Q_n = a_{n+1} - a_n = -f(a_n)/f'(a_n)$. By (iii) for $n$, $|Q_n| \le |f'(a_n)|\,c^{2^n}$;
in particular, $|Q_n| < |f'(a_n)|$. In the expansion of $f'(X)$, we take $b = a_n$ and
evaluate at $X = a_{n+1}$ to obtain

$$f'(a_{n+1}) - f'(a_n) = Q_n\, h(a_{n+1}).$$

Since $|Q_n| < |f'(a_n)|$ and $|h(a_{n+1})| \le 1$, we see that $|f'(a_{n+1})| = |f'(a_n)|$.
This proves (ii) for $n + 1$.

In the expansion of $f(X)$, we take $b = a_n$ and evaluate at $X = a_{n+1}$ to obtain

$$f(a_{n+1}) = f(a_n) + (a_{n+1} - a_n) f'(a_n) + (a_{n+1} - a_n)^2 g(a_{n+1}).$$

But $(a_{n+1} - a_n) f'(a_n) = -f(a_n)$, and hence this equation simplifies to

$$f(a_{n+1}) = Q_n^2\, g(a_{n+1}).$$
Since $g(a_{n+1})$ is in $R$, application of (iii) for $n$ and (ii) for $n + 1$ gives

$$\frac{|f(a_{n+1})|}{|f'(a_{n+1})|} = \frac{|Q_n|^2\,|g(a_{n+1})|}{|f'(a_{n+1})|} \le \left(\frac{|f(a_n)|}{|f'(a_n)|}\right)^{2} \frac{1}{|f'(a_{n+1})|} \le \frac{c^{2^{n+1}}\,|f'(a_0)|^2}{|f'(a_0)|} = c^{2^{n+1}}\,|f'(a_0)|,$$

and this proves (iii) for $n + 1$. This completes the induction.

Now we can prove the theorem. If $n < m$, then (iii') and the ultrametric
inequality imply that

$$|a_m - a_n| \le \max_{n \le k < m} |a_{k+1} - a_k| \le |f'(a_0)| \max_{n \le k < m} c^{2^k} \le |f'(a_0)|\, c^{2^n}. \qquad (*)$$

Consequently $\{a_n\}$ is a Cauchy sequence. Let $a$ be its limit. Substituting
into the definition of $a_{n+1}$, using (ii), and passing to the limit, we obtain $a =
a - f(a)/f'(a)$. Thus $f(a) = 0$. Taking $n = 0$ in $(*)$ and letting $m$ tend to
infinity gives $|a - a_0| \le |f'(a_0)|\,c$, and this is $\le c < 1$ because $f'(a_0)$ is in $R$.
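The recursion of Theorem 6.28 can be watched in action with exact rational arithmetic. The following is only a sketch under stated assumptions: it takes $F = \mathbb{Q}_2$, $f(X) = X^2 - 17$, and $a_0 = 1$, so that $|f(a_0)|_2/|f'(a_0)|_2^2 = 2^{-4}/2^{-2} = 2^{-2} < 1$, and it carries out the Newton steps inside $\mathbb{Q}$, regarded as a subfield of $\mathbb{Q}_2$. The helper names are hypothetical.

```python
from fractions import Fraction

def vp(x, p):
    """p-adic valuation of a nonzero rational x (hypothetical helper)."""
    x = Fraction(x)
    v, num, den = 0, x.numerator, x.denominator
    while num % p == 0:
        num //= p
        v += 1
    while den % p == 0:
        den //= p
        v -= 1
    return v

def newton_iterates(f, df, a0, steps):
    """Exact iterates a_{n+1} = a_n - f(a_n)/f'(a_n), computed in Q."""
    seq = [Fraction(a0)]
    for _ in range(steps):
        a = seq[-1]
        seq.append(a - f(a) / df(a))
    return seq

def f(x):
    return x * x - 17

def df(x):
    return 2 * x

seq = newton_iterates(f, df, 1, 3)
valuations = [vp(f(a), 2) for a in seq]          # 2-adic valuations of f(a_n)
in_Z2 = all(a.denominator % 2 == 1 for a in seq)  # iterates stay in the valuation ring
```

With $c = 2^{-2}$ and $|f'(a_0)|_2 = 2^{-1}$, statement (iii) of the proof predicts $v_2(f(a_n)) \ge 2^{n+1} + 2$, and in this particular example the computed valuations meet that bound with equality.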


Corollary 6.29 (Hensel’s Lemma). Let $F$ be a field with a nontrivial discrete
absolute value, necessarily nonarchimedean, and assume that $F$ is complete. Let
$R$ be the valuation ring, let $\mathfrak{p}$ be the unique maximal ideal, and let $f(X)$ be a
polynomial in $R[X]$. If $\overline{f}(X)$ is the reduced polynomial with coefficients in $R/\mathfrak{p}$
and if $\overline{a}$ is a simple root of $\overline{f}(X)$, then $f(X)$ has a simple root $a \in R$ whose
image in $R/\mathfrak{p}$ is $\overline{a}$.


PROOF. Let $a_0$ be any member of $R$ whose image in $R/\mathfrak{p}$ is $\overline{a}$. The assumptions
imply that $f(a_0)$ is in $\mathfrak{p}$ and that $f'(a_0)$ is in $R$ but not $\mathfrak{p}$. Thus the hypotheses
of Theorem 6.28 are satisfied, and the theorem produces a root $a$ of $f(X)$ with
$a - a_0$ in $\mathfrak{p}$.
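In the case $R = \mathbb{Z}_p$ the corollary can be carried out to any finite precision by running the Newton recursion modulo $p^k$. The sketch below is an illustration under that assumption, with hypothetical function names; polynomials are given by integer coefficient lists, highest degree first. It lifts a square root of 2 from $\mathbb{Z}/7\mathbb{Z}$ and lifts each nonzero residue modulo 7 to a root of $X^6 - 1$.

```python
def poly_eval(coeffs, x, m):
    """Evaluate a polynomial (coefficients from highest degree down) at x modulo m."""
    acc = 0
    for c in coeffs:
        acc = (acc * x + c) % m
    return acc

def poly_deriv(coeffs):
    """Coefficient list of the formal derivative."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]

def lift_simple_root(coeffs, a0, p, k):
    """Lift a simple root a0 of f modulo p to a root of f modulo p^k."""
    dcoeffs = poly_deriv(coeffs)
    if poly_eval(coeffs, a0, p) != 0 or poly_eval(dcoeffs, a0, p) == 0:
        raise ValueError("a0 must be a simple root modulo p")
    m = p ** k
    a = a0 % m
    for _ in range(k):  # each step at least doubles the precision, so k steps suffice
        fa = poly_eval(coeffs, a, m)
        if fa == 0:
            break
        # f'(a) is a unit modulo p^k because a is congruent to a0 modulo p.
        a = (a - fa * pow(poly_eval(dcoeffs, a, m), -1, m)) % m
    return a

sqrt2 = lift_simple_root([1, 0, -2], 3, 7, 6)   # 3^2 = 9 is 2 modulo 7
unity = [lift_simple_root([1, 0, 0, 0, 0, 0, -1], a, 7, 4) for a in range(1, 7)]
```

The six lifted values are distinct modulo 7, matching the statement that distinct simple roots of the reduced polynomial lift to distinct roots.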


EXAMPLES WITH $F = \mathbb{Q}_p$ AND $R = \mathbb{Z}_p$.


(1) Suppose that $p$ is an odd prime and that $n$ is an integer for which the
Legendre symbol $\left(\frac{n}{p}\right)$ is $+1$, i.e., for which $\mathrm{GCD}(n, p) = 1$ and $n$ has a square
root modulo $p$. Then $n$ has a square root in $\mathbb{Z}_p$. This is immediate from Corollary
6.29 with $f(X) = X^2 - n$.

(2) Suppose that $p = 2$ and that $n$ is an integer^12 having the form $8k + 1$. The
maximal ideal in $\mathbb{Z}_2$ is $(2)$. Corollary 6.29 is not applicable to $f(X) = X^2 - n$
since evaluation of the derivative $f'(X) = 2X$ at any point of $\mathbb{Z}_2$ leads to a
member of the ideal $(2)$. However, we can apply Theorem 6.28. Let $a_0 = 1$,
so that $f(a_0) = 1 - n$ and $f'(a_0) = 2$. The theorem produces a root $a$ in $\mathbb{Z}_2$ if
$|1 - n|_2/|2|_2^2 < 1$, i.e., if $|1 - n|_2 < \frac{1}{4}$. Since $|1 - n|_2 = |-8k|_2 = \frac{1}{8}|k|_2 < \frac{1}{4}$,
the theorem indeed applies. The resulting root $a$ in $\mathbb{Z}_2$ has $a \equiv 1 \bmod (2)$.


^12 In fact, $n$ could be a 2-adic integer in this argument.




(3) Suppose that $p > 3$. Every nonzero residue $\overline{a}$ in $\mathbb{Z}/p\mathbb{Z}$ has $\overline{a}^{\,p-1} \equiv
1 \bmod p$. Corollary 6.29 shows immediately that the polynomial $X^{p-1} - 1$ has
a root $a$ whose image in $\mathbb{Z}_p/p\mathbb{Z}_p$ is $\overline{a}$. Since the elements $\overline{a}$ are distinct, we
conclude that $\mathbb{Z}_p$ contains all $p - 1$ of the $(p - 1)$st roots of unity.


(4) As promised at the beginning of Section 4, we show that $\mathbb{Q}_p$ for $p = 5$
is obtained also by completing the field $\mathbb{Q}(i)$ with respect to a certain absolute
value and that in fact there are two distinct equivalence classes of absolute values
on $\mathbb{Q}(i)$ for which $\mathbb{Q}_5$ results. Thus let $F = \mathbb{Q}$, $K = \mathbb{Q}(i)$, and $\mathfrak{p} = (5)$. The
prime factorization of $(5)\mathbb{Z}[i]$ is as $(2 + i)(2 - i)$. If we put $P_1 = (2 + i)$ and
$P_2 = (2 - i)$, then $K_{P_1}$ and $K_{P_2}$ are both equal to $\mathbb{Q}_5$ because Example 1 above
shows that the square roots of $-1$ already appear in $\mathbb{Q}_5$. If $a$ is one of the square
roots, then $|2 + a|_5\,|2 - a|_5 = |(2 + a)(2 - a)|_5 = |5|_5 = \frac{1}{5}$. Thus one of $|2 + a|_5$
and $|2 - a|_5$ equals $\frac{1}{5}$ and the other equals 1. What is happening is that there are
two field mappings $\mathbb{Q}(i) \to \mathbb{Q}_5$. For each of them, the effect on the base field $\mathbb{Q}$
is the same; however, one field mapping sends $i$ in $\mathbb{Q}(i)$ to $a$ in $\mathbb{Q}_5$, and the other
sends $i$ to $-a$. For definiteness, let us say that $|2 + a|_5 = \frac{1}{5}$. Then the valuation
of $\mathbb{Q}(i)$ with respect to $P_1 = (2 + i)$ is consistent with the 5-adic valuation of $\mathbb{Q}_5$,
but the valuation of $P_2 = (2 - i)$ is not. This example shows why the definition
of completion insists on a mapping of valued fields (respecting absolute values),
not merely a mapping of fields.


(5) Suppose that $p = 2$. The question is the prime factorization of $f(X) =
X^3 + X^2 - 2X + 8$ over $\mathbb{Z}_2$. This polynomial was studied at length toward the end of
Section V.4 in connection with common index divisors. It is irreducible over $\mathbb{Q}$,
but we are to factor it over $\mathbb{Q}_2$. We shall show that it splits into first-degree factors.
Considering the polynomial modulo 2, we find that $f(X) \equiv (X - 1)X^2 \bmod 2$.
Since 1 is a simple root modulo 2, Corollary 6.29 says that there exists an element
$\theta_1$ in $\mathbb{Z}_2$ such that $f(\theta_1) = 0$ and $\theta_1 \equiv 1 \bmod 2$. Dividing $f(X)$ by $X - \theta_1$, we
obtain

$$f(X) = (X - \theta_1)\big(X^2 + (\theta_1 + 1)X + (\theta_1(\theta_1 + 1) - 2)\big).$$

To show that the quadratic factor splits over $\mathbb{Q}_2$, it is necessary and sufficient
to show that its discriminant is a square, since $\mathbb{Q}_2$ has characteristic 0. The
discriminant is

$$(\theta_1 + 1)^2 - 4\big(\theta_1(\theta_1 + 1) - 2\big) = 4\Big(\big(\tfrac{1}{2}(\theta_1 + 1)\big)^2 - \big(\theta_1(\theta_1 + 1) - 2\big)\Big),$$

and we can ignore the square factor of 4. We know that $\theta_1 \equiv 1 \bmod 2$. Let us
compute $\theta_1$ modulo $8\mathbb{Z}_2$ by writing $\theta_1 = 8q + c$ with $q \in \mathbb{Z}_2$ and with $c = \pm 1$
or $\pm 3$. Substituting into $f(X)$ and computing modulo $8\mathbb{Z}_2$, we have

$$0 = f(\theta_1) \equiv c^3 + c^2 - 2c \bmod 8\mathbb{Z}_2.$$




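Since $c$ is odd, $c^3 \equiv c$ and $c^2 \equiv 1 \bmod 8$. Thus $0 \equiv c + 1 - 2c \bmod 8$ and
$c \equiv 1 \bmod 8$. Consequently

$$\big(\tfrac{1}{2}(\theta_1 + 1)\big)^2 - \big(\theta_1(\theta_1 + 1) - 2\big) \equiv 1 \bmod 8.$$

By Example 2 any 2-adic integer that is $\equiv 1 \bmod 8\mathbb{Z}_2$ is a square in $\mathbb{Z}_2$, and thus
$f(X)$ indeed factors over $\mathbb{Z}_2$ as the product of three first-degree factors.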
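The computation in Example 5 can be checked numerically to finite 2-adic precision. The following sketch assumes a working precision of $2^{20}$ (an arbitrary choice) and uses ordinary integers as stand-ins for 2-adic integers. It lifts the simple root 1 by Newton's method, forms one quarter of the discriminant of the quadratic factor, extracts a square root of that quantity one binary digit at a time, and assembles the three roots.

```python
def f(x):
    return x**3 + x**2 - 2*x + 8

def df(x):
    return 3*x**2 + 2*x - 2

N = 20                 # working precision: everything is computed modulo 2^N
M = 2 ** N

# Step 1: lift the simple root 1 mod 2 to theta mod 2^N (f'(theta) is odd, hence a unit).
theta = 1
for _ in range(N):
    theta = (theta - f(theta) * pow(df(theta), -1, M)) % M

# Step 2: the quadratic factor is X^2 + 2uX + c with u = (theta + 1)/2.
u = (theta + 1) // 2
c = theta * (theta + 1) - 2
E = u * u - c          # one quarter of the discriminant

# Step 3: a square root s of E modulo 2^N. If s^2 = E mod 2^j with j >= 3,
# then either s or s + 2^(j-1) works modulo 2^(j+1).
s = 1
for j in range(3, N):
    if (s * s - E) % 2 ** (j + 1) != 0:
        s += 2 ** (j - 1)

roots = [theta, (-u + s) % M, (-u - s) % M]
```

The starting value `s = 1` in Step 3 is valid precisely because $E \equiv 1 \bmod 8$, which is the congruence established in the example.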


We conclude this section with a version of Hensel’s Lemma that we state
without proof.^13 This version deals with factorizations rather than roots. Briefly
it says that we can lift a relatively prime factorization modulo $\mathfrak{p}$ to a factorization
in $R[X]$ if at least one of the two factors modulo $\mathfrak{p}$ has leading coefficient 1. This
theorem certainly implies Corollary 6.29.


Theorem 6.30 (Hensel’s Lemma). Let $F$ be a field with a nontrivial discrete
absolute value, necessarily nonarchimedean, and assume that $F$ is complete. Let
$R$ be the valuation ring, let $\mathfrak{p}$ be the unique maximal ideal, let $k$ be the residue
class field, and let $f(X)$ be a polynomial in $R[X]$. Suppose that there exist
polynomials $g_0(X)$ and $h_0(X)$ in $R[X]$ such that $g_0(X) \bmod \mathfrak{p}$ and $h_0(X) \bmod \mathfrak{p}$
are relatively prime in $k[X]$, $g_0$ has leading coefficient 1, and $f(X)$ factors modulo
$\mathfrak{p}$ as $f(X) \equiv g_0(X)h_0(X) \bmod \mathfrak{p}$. Then there exist polynomials $g(X)$ and $h(X)$
in $R[X]$ such that $g(X)$ has leading coefficient 1, $g(X) \equiv g_0(X) \bmod \mathfrak{p}$, $h(X) \equiv
h_0(X) \bmod \mathfrak{p}$, and $f(X)$ factors in $R[X]$ as $f(X) = g(X)h(X)$.
Sections 1-4 have presented the ingredients of a two-stage process for analyzing
congruence information, and now it is time to use everything together. The goal
is to have techniques for extracting information about a global number-theoretic
problem by seeing what the problem says about ideals, for reducing the questions
about ideals to questions about powers of prime ideals, and for then assembling
the results.

We give one illustration of the utility of our constructions: With the techniques 
we had in Chapter V, we gave only a partial proof of the Dedekind Discriminant 
Theorem (Theorem 5.5). By contrast, we shall see in Section 8 that the present 
techniques lead naturally to a complete proof. 

Although we might want to work just within one number field, it is helpful to
change the context so that we are comparing a number field with a finite extension.
There is no loss of generality in doing so; we can always take the base field to
be the rationals $\mathbb{Q}$, and the effect is that we consider only the finite set of prime
ideals for the extension field that contain a given prime number $p$.

^13 A proof may be found in Hasse’s Number Theory, pp. 169-172.

As long as we are going to consider finite extensions of fields in addressing 
number theory, we might as well treat also the case of function fields in one 
variable, at least to the extent that the two theories are quite analogous. Thus we 
are led to the following set-up. 

Let $R$ be a Dedekind domain considered as a subring of its field of fractions
$F$, let $K$ be a finite separable^14 extension of $F$ with $[K : F] = n$, and let $T$ be
the integral closure of $R$ in $K$. We shall work with $F$ and $K$ as valued fields,
having some absolute value on them. The case of interest in this section will be
that the absolute value is nonarchimedean and arises from a discrete valuation
whose valuation ring contains $R$ or $T$, respectively. Theorem 6.5 shows that the
valuation is defined by means of some prime ideal $\wp$ of $R$ or $T$, and the associated
absolute value may thus be denoted by an expression^15 like $|\cdot|_{\wp}$.

We start from a prime ideal $\mathfrak{p}$ in $R$ and form the corresponding absolute value
on $F$ as in Section 3, obtaining a valued field $(F, |\cdot|_{\mathfrak{p}})$. Then we complete as in
Section 4, writing the completion as

$$\psi_0 : (F, |\cdot|_{\mathfrak{p}}) \to (F_{\mathfrak{p}}, |\cdot|_{\mathfrak{p}}).$$

We know that the ideal $\mathfrak{p}T$ in $T$ has a prime factorization of the form $\mathfrak{p}T =
P_1^{e_1} \cdots P_g^{e_g}$, where $P_1, \dots, P_g$ are distinct prime ideals in $T$. The integers $e_i$ are
called ramification indices and the dimensions $f_i = \dim_{R/\mathfrak{p}}(T/P_i)$ are called
residue class degrees. We are interested in saying everything we can about
$P_1, \dots, P_g$ and about the indices $e_i$ and $f_i$. The fundamental relationship is
given by Theorem 9.60 of Basic Algebra, namely

$$\sum_{i=1}^{g} e_i f_i = n.$$
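As a quick sanity check of this relationship (a standard illustration supplied here, not a computation from the text), take $R = \mathbb{Z}$, $K = \mathbb{Q}(i)$, and $T = \mathbb{Z}[i]$, so that $n = 2$. The three possible behaviors of a prime number all satisfy the formula:

```latex
\begin{array}{lllll}
\text{prime} & \text{factorization in } \mathbb{Z}[i] & g & (e_i, f_i) & \sum_i e_i f_i \\[2pt]
2 & 2\mathbb{Z}[i] = (1 + i)^2           & 1 & (2, 1)         & 2 \\
3 & 3\mathbb{Z}[i] \text{ is prime}      & 1 & (1, 2)         & 2 \\
5 & 5\mathbb{Z}[i] = (2 + i)(2 - i)      & 2 & (1, 1),\ (1, 1) & 2
\end{array}
```

For the prime 3 the residue class field $\mathbb{Z}[i]/(3)$ has 9 elements, which is why the residue class degree is 2.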


We know that each $P_i$ gives us a nonarchimedean absolute value $|\cdot|_{P_i}$ on $K$,
unique up to equivalence, and then a completion

$$\psi_i : (K, |\cdot|_{P_i}) \to (K_{P_i}, |\cdot|_{P_i}).$$


^14 The role of separability will become apparent before the statement of Theorem 6.31 below.

^15 The number-theory case ultimately requires also a limited amount of analysis of archimedean
absolute values, and that will be carried out in Section 9. In the context of passing from a Diophantine
equation to congruence information, part of the role that archimedean absolute values play is in
analyzing signs. Thus for example the simple-minded equation $x^2 + y^2 = -1$ has no solutions in
integers; the reason for the absence of solutions is a constraint on signs, not some limitation from
congruences with respect to powers of primes. Archimedean absolute values control signs.
The first important step is to establish an isomorphism involving fields such
that the identity $\sum_{i=1}^{g} e_i f_i = n$ is a dimension formula that follows from the
isomorphism. The identity in question concerns the ring $K \otimes_F F_{\mathfrak{p}}$, which is
a commutative algebra over $K$ or over $F_{\mathfrak{p}}$, whichever we like, and which is
semisimple by Corollary 2.30 under our assumption that $K$ is a finite separable
extension of $F$. The Wedderburn theory (Theorems 2.2 and 2.4) shows that
$K \otimes_F F_{\mathfrak{p}}$ is isomorphic to a finite direct product of fields,^16 each of which is a
finite extension of $F_{\mathfrak{p}}$. What we shall prove later in this section is the following
theorem.


Theorem 6.31. Let R be a Dedekind domain considered as a subring of its
field of fractions F, let K be a finite separable extension of F with [K : F] = n,
and let T be the integral closure of R in K. If p is a nonzero prime ideal of R
and if the ideal pT in T has a prime factorization of the form $pT = P_1^{e_1} \cdots P_g^{e_g}$,
where $P_1, \dots, P_g$ are distinct prime ideals in T and the $e_j$ are positive integers,
then


$$K \otimes_F F_p \cong \prod_{j=1}^{g} K_{P_j}.$$


When the formula $\sum_{j=1}^{g} e_j f_j = n$ is specialized to the field extension $K_{P_j}/F_p$,
it becomes $e_j^* f_j^* = [K_{P_j} : F_p]$, where $e_j^*$ and $f_j^*$ are the ramification index and
residue class degree associated to $K_{P_j}/F_p$. If we accept for the moment the result
of Lemma 6.36 below that $e_j^*$ and $f_j^*$ coincide with the corresponding indices $e_j$
and $f_j$ for K/F, then $n = \sum_{j=1}^{g} e_j f_j = \sum_{j=1}^{g} e_j^* f_j^* = \sum_{j=1}^{g} [K_{P_j} : F_p]$ indeed
counts the $F_p$ dimensions of both sides of the formula $K \otimes_F F_p \cong \prod_{j=1}^{g} K_{P_j}$
in the theorem. The theorem says much more than this, and we shall mine its
consequences after giving the proof of the theorem.

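For orientation, let us recall Example 4 from Section 5. In that example, we had
$R = \mathbb{Z}$, $F = \mathbb{Q}$, $K = \mathbb{Q}(i)$, $T = \mathbb{Z}[i]$, $p = 5\mathbb{Z}$, and $F_p = \mathbb{Q}_5$. The factorization
$pT = \prod_j P_j^{e_j}$ is $5\mathbb{Z}[i] = (2+i)(2-i)$, and the two completed versions of K are
$K_{(2+i)} \cong \mathbb{Q}_5$ and $K_{(2-i)} \cong \mathbb{Q}_5$. Thus the identity in the theorem specializes to


$$\mathbb{Q}(i) \otimes_{\mathbb{Q}} \mathbb{Q}_5 \cong \mathbb{Q}_5 \times \mathbb{Q}_5.$$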


Proving the identity on this level would be more challenging than necessary 
because the isomorphism cannot be unique; it can always be composed with 
the interchange of the two factors on the right side. For this reason the proof 
makes use of valued fields, and then in effect the desired isomorphism becomes 
a constructive one that we can write down rather explicitly. 
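The splitting in this example can be made concrete: Q(i) is Q[X]/(X^2 + 1), and X^2 + 1 has two roots in the 5-adic integers, one for each factor Q_5. The sketch below approximates such a root modulo 5^k by Newton iteration, starting from 2, whose square is -1 modulo 5. The function name and the precision k = 8 are illustrative choices and are not taken from the text.

```python
def lift_sqrt_minus_one(p, k, r0):
    """Lift a root r0 of X^2 + 1 modulo p to a root modulo p^k."""
    if (r0 * r0 + 1) % p != 0:
        raise ValueError("r0 is not a root of X^2 + 1 modulo p")
    r, modulus = r0 % p, p
    for _ in range(k - 1):
        modulus *= p
        # Newton step r <- r - f(r)/f'(r) for f(X) = X^2 + 1, f'(X) = 2X.
        # pow(x, -1, m) is the modular inverse (Python 3.8 or later).
        r = (r - (r * r + 1) * pow(2 * r, -1, modulus)) % modulus
    return r


p, k = 5, 8
r = lift_sqrt_minus_one(p, k, 2)
s = (-r) % p**k
print(r, s, (r * r + 1) % p**k, (s * s + 1) % p**k)
```

Each of the two roots r and -r gives an embedding of Q(i) into Q_5, sending i to that root, and exchanging the two roots corresponds to the interchange of the two factors mentioned above.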


¹⁶The words “direct product” in connection with finitely many fields refer to the direct sum of
the additive structures, with multiplication given coordinate by coordinate.
Let us now work toward proving Theorem 6.31. Above, we mentioned the
completion mapping $\psi_0$ for F relative to an absolute value in the equivalence
class determined by p, as well as $\psi_j$ for K relative to some absolute value in the
class determined by $P_j$. In addition, we have inclusion mappings corresponding
to the field extensions K/F and $K_{P_j}/F_p$. Figure 6.1 below is a square diagram
that assigns the names $\varphi_0$ and $\varphi_j$ to these as well.


FIGURE 6.1. Commutativity of completion and extension as field mappings. 


The diagram in Figure 6.1 commutes. In fact, $\psi_j\varphi_0$ and $\varphi_j\psi_0$ are both F
homomorphisms, being compositions of F homomorphisms, and hence $x \in F$
implies $\psi_j\varphi_0(x) = x(\psi_j\varphi_0(1)) = x1 = x(\varphi_j\psi_0(1)) = \varphi_j\psi_0(x)$.

But more is true: we are going to impose absolute values on the four fields in the
diagram in such a way that the four field mappings are homomorphisms of valued
fields. We have already defined $|\cdot|_p$ on F as any absolute value corresponding
to p, and then $|\cdot|_p$ is defined on $F_p$ in such a way that the completion mapping
$\psi_0$ preserves absolute values. Theorem 6.33 below will enable us to define an
absolute value in a unique fashion on $K_{P_j}$ such that $\varphi_j$ preserves absolute values.
Proposition 6.34 will give us the definition of an absolute value on K, and we
shall check in Lemma 6.35 that Figure 6.1 with these absolute values in place is
a commutative diagram of valued fields. Finally we use this commutativity to
prove in Lemma 6.36 that the ramification index $e_j^*$ and residue class degree $f_j^*$
for $K_{P_j}/F_p$ match the corresponding parameters $e_j$ and $f_j$ for K/F, and then we
are ready for the main part of the proof of the theorem.

We begin our preliminary work by limiting the possibilities for a finite extension
of a complete valued field $(F, |\cdot|_F)$. If K is a finite extension of F, a norm
on the F vector space K relative to $|\cdot|_F$ is a function $\|\cdot\|$ from K to $\mathbb{R}$ having

(i) $\|x\| \ge 0$ on K with equality if and only if x = 0,
(ii) $\|cx\| = |c|_F \|x\|$ for $c \in F$ and $x \in K$,
(iii) $\|x + y\| \le \|x\| + \|y\|$ for all x and y in K.


Lemma 6.32. If $(F, |\cdot|_F)$ is a complete valued field, if K is a finite extension
of F, and if $\|\cdot\|_1$ and $\|\cdot\|_2$ are any two norms on K relative to $|\cdot|_F$, then there
exist real constants C and C' such that


$$\|x\|_1 \le C\|x\|_2 \quad\text{and}\quad \|x\|_2 \le C'\|x\|_1 \qquad \text{for all } x \in K.$$


Consequently K is Cauchy complete in the metric induced by either norm.
REMARK. It is not important that K be a field in this lemma, only that it be a
finite-dimensional vector space over F.


PROOF. Let $n = \dim_F K$. Fixing an ordered basis $(x_1, \dots, x_n)$ of K over F,
we may express any member x of K in the form $x = \sum_{i=1}^{n} c_i x_i$ with all $c_i$ in F.
With the $c_i$'s defined this way, we define $\|x\|_{\sup} = \max_{1 \le j \le n} |c_j|_F$. To prove the
displayed inequalities, it is enough to prove them for $\|\cdot\|_{\sup}$ and any other norm
$\|\cdot\|$. For one direction of the inequality, we have


$$\|x\| = \Big\|\sum_i c_i x_i\Big\| \le \sum_i \|c_i x_i\| = \sum_i |c_i|_F \|x_i\| \le \Big(\sum_i \|x_i\|\Big) \|x\|_{\sup}.$$


This proves that $\|x\| \le C\|x\|_{\sup}$ with $C = \sum_i \|x_i\|$.

For the reverse inequality we shall prove by induction on k that an inequality
$\|x\|_{\sup} \le C_k\|x\|$ holds for all x in the F linear span of at most k of the vectors
$x_1, \dots, x_n$. The base case for the induction is k = 1, and then
$\|x\|_{\sup} = \|x_j\|^{-1}\|x\|$ whenever x is a multiple of $x_j$. So $C_1 = \max_{1 \le j \le n}(\|x_j\|^{-1})$.

Assume that $C_1, \dots, C_k$ exist and that we are to produce $C_{k+1}$. Arguing by
contradiction, we may assume that there is some sequence $\{x^{(m)}\}$ in K, each term
having at most k + 1 nonzero coefficients, such that $\|x^{(m)}\| = 1$ for all m and
$\|x^{(m)}\|_{\sup}$ tends to infinity. Possibly by passing to a subsequence, we may assume
that the nonzero coefficients of $x^{(m)}$ all lie in a particular subset of k + 1 of the
coefficients, and there is no harm in assuming that this subset is $\{1, \dots, k+1\}$.
Passing to a further subsequence, we may assume that there is some index j such
that the largest coefficient of each $x^{(m)}$ when measured by $|\cdot|_F$ is the j-th, and
there is no harm in assuming that j = k + 1.

Let $c_1^{(m)}, \dots, c_{k+1}^{(m)}$ be the coefficients of $x^{(m)}$, so that $x^{(m)} = \sum_{i=1}^{k+1} c_i^{(m)} x_i$. Put
$y^{(m)} = (c_{k+1}^{(m)})^{-1} x^{(m)} = \sum_{i=1}^{k} d_i^{(m)} x_i + x_{k+1}$, where $d_i^{(m)} = (c_{k+1}^{(m)})^{-1} c_i^{(m)}$. Here
$|d_i^{(m)}|_F \le 1$ for $1 \le i \le k$ and for all m, and also
$\|y^{(m)}\| = |c_{k+1}^{(m)}|_F^{-1} \|x^{(m)}\| = |c_{k+1}^{(m)}|_F^{-1}$ tends to 0.

For each vector $y^{(m)} - x_{k+1}$, only the first k coefficients can be nonzero, and the
same thing is true of differences $y^{(m)} - y^{(m')}$ of two such vectors. The inductive
hypothesis tells us that $\|y^{(m)} - y^{(m')}\|_{\sup} \le C_k\|y^{(m)} - y^{(m')}\|$, and the right
side tends to 0 as m and m' tend to infinity because $\|y^{(m)}\|$ and $\|y^{(m')}\|$ tend to 0.
Therefore the i-th coordinate of $y^{(m)}$ forms a Cauchy sequence. Since F is given as
complete, $\{y^{(m)}\}$ is convergent in the norm $\|\cdot\|_{\sup}$ to some $y = \sum_{i=1}^{k} d_i x_i + x_{k+1}$
in K.

By the easy direction of our inequality, $\|y^{(m)} - y\| \le C\|y^{(m)} - y\|_{\sup}$. The right
side tends to 0, and hence so does the left. We know that $\|y^{(m)}\|$ tends to 0, and
hence y = 0. But this conclusion contradicts the form of y as $\sum_{i=1}^{k} d_i x_i + x_{k+1}$
with coefficient 1 for $x_{k+1}$. We conclude that $C_{k+1}$ exists as asserted, and the
lemma follows.
Theorem 6.33. If $(F, |\cdot|_F)$ is a complete valued field relative to a nontrivial
nonarchimedean discrete absolute value and if K is a finite separable extension
of F with [K : F] = n, then K has a unique absolute value $|\cdot|_K$ extending
$|\cdot|_F$, K is complete and nonarchimedean, and the integral closure T in K of
the valuation ring R of F is the valuation ring of K. The extension is given by


$$|x|_K = |N_{K/F}(x)|_F^{1/n}.$$


REMARKS. Since T is the valuation ring, Proposition 6.2 shows that T has a
unique nonzero prime ideal. It follows that if p is a nonzero prime ideal of R,
then $pT = P^e$ for a single prime ideal P of T. We shall make frequent use of
this fact in applications without explicit mention.
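To see the formula $|x|_K = |N_{K/F}(x)|_F^{1/n}$ at work, the sketch below takes F = Q_3 and K = Q_3(i), which is a quadratic extension because -1 is not a square modulo 3, and evaluates the corresponding valuation on elements a + bi with rational a and b, where $N_{K/F}(a + bi) = a^2 + b^2$. These choices, the normalization $|3|_F = 1/3$, and the helper names are assumptions made for illustration. The code works with the valuation $v_K = -\log_3 |\cdot|_K$ so that all arithmetic is exact.

```python
from fractions import Fraction


def v3(q):
    """3-adic valuation of a nonzero rational number q."""
    q = Fraction(q)
    if q == 0:
        raise ValueError("the valuation of 0 is +infinity")
    num, den, v = q.numerator, q.denominator, 0
    while num % 3 == 0:
        num //= 3
        v += 1
    while den % 3 == 0:
        den //= 3
        v -= 1
    return v


def v_K(a, b):
    """Valuation of a + b*i in Q_3(i), defined by |x|_K = |N(x)|_3^(1/2)."""
    norm = Fraction(a) ** 2 + Fraction(b) ** 2  # N(a + b i) = a^2 + b^2
    return Fraction(v3(norm), 2)


def mul(x, y):
    """Product of two Gaussian rationals given as pairs (a, b)."""
    (a, b), (c, d) = x, y
    return (a * c - b * d, a * d + b * c)


x, y = (3, 6), (1, 1)
print(v_K(*x), v_K(*y), v_K(*mul(x, y)))           # multiplicativity
print(v_K(x[0] + y[0], x[1] + y[1]))                # ultrametric inequality
print(v_K(Fraction(9, 2), 0), v3(Fraction(9, 2)))   # agreement with v3 on rationals
```

The three printed lines check, on sample elements only, that the valuation is additive on products, satisfies the ultrametric inequality, and restricts to the 3-adic valuation on rational numbers.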


PROOF. For uniqueness, suppose that $|\cdot|_1$ and $|\cdot|_2$ are two absolute values on
K that extend $|\cdot|_F$. Let us see that each of these is a norm on K relative to $|\cdot|_F$.
In fact, what needs checking for $|\cdot|_1$ is that the function respects scalars from
F appropriately. If c is in F and $x_0$ is in K, then $|cx_0|_1 = |c|_1|x_0|_1 = |c|_F|x_0|_1$,
the second equality following because $|\cdot|_1$ restricts to $|\cdot|_F$ on F. A similar
argument applies to $|\cdot|_2$, and thus we are dealing with two norms.

If the two given absolute values are inequivalent, then Proposition 6.12 shows
in the presence of the nontriviality of $|\cdot|_1$ that we can find an $x \in K$ with
$|x|_1 > 1$ and $|x|_2 \le 1$. Then $\lim_k |x^{-k}|_1 = 0$ while $|x^{-k}|_2 \ge 1$ for all k.
Consequently there cannot exist a constant C such that $|y|_2 \le C|y|_1$ for all
$y \in K$, in contradiction to Lemma 6.32.

We conclude that $|\cdot|_1$ and $|\cdot|_2$ are equivalent, say that $|x|_1 = |x|_2^s$ for all
$x \in K$ and some s > 0. Since $|\cdot|_F$ is nontrivial, there exists some $x_0 \in F$
with $|x_0|_F > 1$. The equality $|x_0|_F = |x_0|_F^s$ then implies that s = 1. This proves
uniqueness.

We turn to existence. Proposition 6.2 shows that the valuation ring R in F for
the discrete valuation $v_F$ corresponding to $|\cdot|_F$ on F is a local principal ideal
domain and that the valuation ideal p is the unique maximal ideal of R. Theorem
6.5 shows that the valuation $v_p$ determined by p is the same as the given valuation
$v_F$. Hence $|\cdot|_F$ is given for all $a \in F$ by $|a|_F = r^{-v_p(a)}$ for some r > 1. Let $\pi$
be a generator of the principal ideal p of R.

Since K/F is finite and separable, Theorem 8.54 of Basic Algebra shows that
the integral closure T of R in K is a Dedekind domain. Let $pT = P_1^{e_1} \cdots P_g^{e_g}$
be the factorization of the ideal pT of T into the product of powers of distinct
prime ideals of T. Each $P_j$ defines a nonarchimedean valuation $v_{P_j}$ of K. If a
is any nonzero element of F, then we can write $a = \pi^k u$ for some $u \in R^\times$ and some
integer k. The computation $aT = aRT = \pi^k uRT = \pi^k RT = \pi^k T = p^k T =
P_1^{ke_1} \cdots P_g^{ke_g}$ shows that $v_p(a) = k$ and that $v_{P_j}(a) = ke_j$. Hence $v_{P_j} = e_j v_p$ on
F, and therefore the formula $|x|_{P_j} = (r^{1/e_j})^{-v_{P_j}(x)}$ for $x \in K$ defines an absolute
value on K that has $|a|_F = r^{-v_p(a)} = r^{-e_j^{-1} v_{P_j}(a)} = (r^{1/e_j})^{-v_{P_j}(a)} = |a|_{P_j}$ for all
a in F. This proves existence. The absolute value $|\cdot|_{P_j}$ on K is complete by
Lemma 6.32 and is nonarchimedean because it is given by a discrete valuation.

Let us show that g = 1. Arguing by contradiction, suppose that there are at
least two distinct prime ideals $P_1$ and $P_2$ of T that contain p. Since $P_1 + P_2 = T$,
we can choose $x_1 \in P_1$ and $x_2 \in P_2$ with $x_1 + x_2 = 1$. Then $v_{P_1}(x_1) > 0$ and
$v_{P_1}(1) = 0$, from which we see that $v_{P_1}(x_2) = 0$. Since $v_{P_2}(x_2) > 0$, we obtain
a contradiction to the uniqueness part of the theorem. Thus the prime ideal of T
is unique. Let us write P for this ideal.

We know that $v_P(T) \ge 0$, i.e., that T is contained in the valuation ring of $v_P$.
Proposition 6.4 shows that the valuation ring of $v_P$ equals $S^{-1}T$, where S is the
complement of P in T. The uniqueness of P means that T is local, and hence
every member of S is a unit in T. Thus $S^{-1}T = T$, and T is the valuation ring.

Write $|\cdot|_K$ in place of $|\cdot|_P$. To prove the explicit formula for $|\cdot|_K$ in
the statement of the theorem, choose a finite Galois extension L of F that
contains K; such a field L exists because K/F is separable.¹⁷ By the existence
just proved, let $|\cdot|_L$ be an extension of $|\cdot|_F$ to L. If $\sigma$ is in Gal(L/F), then
$x \mapsto |\sigma(x)|_L$ and $x \mapsto |x|_L$ are both absolute values on L that extend $|\cdot|_F$. By
the uniqueness just proved, $|\sigma(x)|_L = |x|_L$. Applying $|\cdot|_L$ to both sides of the
formula $N_{L/F}(x) = \prod_{\sigma \in \mathrm{Gal}(L/F)} \sigma(x)$ gives

$$|N_{L/F}(x)|_F = |N_{L/F}(x)|_L = \prod_{\sigma \in \mathrm{Gal}(L/F)} |\sigma(x)|_L = |x|_L^{[L:F]}. \tag{*}$$

If x is in K, then the left side equals $(|N_{K/F}(x)|_F)^{[L:K]}$, and the right side equals
$(|x|_L^{[K:F]})^{[L:K]} = (|x|_K^{[K:F]})^{[L:K]}$. Thus the desired formula follows by extracting
the positive $[L:K]$-th root of both sides of (*).


Proposition 6.34. Under the hypotheses of Theorem 6.31, let $v_p$ be the
valuation of F defined by p, and let $v_{P_j}$ be the valuation of K defined by $P_j$,
$1 \le j \le g$. Then $e_j v_p = v_{P_j}|_F$. Consequently if $|\cdot|_p$ is an absolute value on
F defined by p, then for each j some member $|\cdot|_{P_j}$ of the equivalence class
of absolute values defined on K by $P_j$ is an extension of $|\cdot|_p$. In this case the
inclusion of $(F, |\cdot|_p)$ into $(K, |\cdot|_{P_j})$ is a homomorphism of valued fields.


PROOF. Let S be the multiplicative system in R given as the set-theoretic
complement of p in R. For the first conclusion Proposition 6.4 and Theorem 6.5
together show that it is enough to prove that


$$e_j v_{S^{-1}p} = v_{S^{-1}P_j}|_F. \tag{*}$$


¹⁷The field L can be taken to be a splitting field of the minimal polynomial over F of an element
$\xi$ such that $K = F(\xi)$. The extension L/F is separable by Corollary 9.30 of Basic Algebra.
From the identity

$$pT = P_1^{e_1} \cdots P_g^{e_g},$$

we have

$$S^{-1}pT = (S^{-1}P_1)^{e_1} \cdots (S^{-1}P_g)^{e_g}. \tag{**}$$


Since S is the complement of p in R, $v_p$ is 0 on S. Hence $v_{S^{-1}p}$ is 0 on S. From
$R \cap P_j = p$, we have $S \cap P_j \subseteq S \cap p = \varnothing$. Thus the members of S lie in $R \subseteq T$
but in no $P_j$, and $v_{P_j}$ is 0 on S. Hence $v_{S^{-1}P_j}$ is 0 on S.

Let $\pi$ be a generator of the principal ideal $S^{-1}p$ in $S^{-1}R$, so that $v_{S^{-1}p}(\pi) = 1$.
Since $\pi S^{-1}T = S^{-1}pT$, equation (**) shows that $v_{S^{-1}P_j}(\pi) = e_j$. Each nonzero element
y of F is of the form $y = \pi^k u$ for some integer k and some $u \in F$ with
$v_{S^{-1}p}(u) = 0$. The element u must be in $S^{-1}R$ but not $S^{-1}p$ and hence is a
unit in $S^{-1}R$. Thus $v_{S^{-1}P_j}(u) = 0$. We have now seen that $v_{S^{-1}P_j}(x) = e_j v_{S^{-1}p}(x)$ for the
element x = u above and also for $x = \pi$. Therefore $v_{S^{-1}P_j}(x) = e_j v_{S^{-1}p}(x)$ for
all $x \in F$, and (*) is proved.

Now that $e_j v_p = v_{P_j}|_F$, choose r > 1 such that $|x|_p = r^{-v_p(x)}$ for $x \in F$.
If r' is defined by $r = (r')^{e_j}$, then the definition $|x|_{P_j} = (r')^{-v_{P_j}(x)}$ for $x \in K$
restricts for $x \in F$ to $|x|_{P_j} = (r')^{-v_{P_j}(x)} = (r')^{-e_j v_p(x)} = r^{-v_p(x)} = |x|_p$, and the
inclusion is indeed a homomorphism of valued fields.
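The identity $e_j v_p = v_{P_j}|_F$ can be checked by hand in a ramified case. The sketch below assumes R = Z, T = Z[i], p = 2Z, and P = (1 + i), for which 2T = P^2 and hence e = 2; these choices and the function names are illustrative and do not come from the text. Division by 1 + i in Z[i] is exact precisely when a + b is even, since (a + bi)/(1 + i) = ((a + b) + (b - a)i)/2.

```python
def v_P(a, b):
    """Valuation of the nonzero Gaussian integer a + b*i at P = (1 + i)."""
    if a == 0 and b == 0:
        raise ValueError("the valuation of 0 is +infinity")
    v = 0
    while (a + b) % 2 == 0:
        # (a + b i) / (1 + i) = ((a + b) + (b - a) i) / 2, an exact division here
        a, b = (a + b) // 2, (b - a) // 2
        v += 1
    return v


def v_2(a):
    """2-adic valuation of a nonzero integer a."""
    if a == 0:
        raise ValueError("the valuation of 0 is +infinity")
    v = 0
    while a % 2 == 0:
        a //= 2
        v += 1
    return v


e = 2  # ramification index of P = (1 + i) over 2Z
print(all(v_P(a, 0) == e * v_2(a) for a in range(1, 200)))
```

With r = 2 and r' equal to the square root of 2, so that r = (r')^e, the absolute value (r')^{-v_P(x)} on Q(i) then restricts on Q to 2^{-v_2(x)}, in line with the last step of the proof.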


With these facts in place, let us make Figure 6.1 into a commutative diagram
of valued fields. From p, we use any corresponding choice of $|\cdot|_p$ on F, and
this uniquely determines an absolute value by the same name on $F_p$. Next we
apply Theorem 6.33 to the inclusion $\varphi_j : F_p \to K_{P_j}$ to obtain a unique extension
of $|\cdot|_p$ from $F_p$ to an absolute value $|\cdot|_{P_j}$ on $K_{P_j}$.

Meanwhile, with the index j specified, Proposition 6.34 gives us a unique
absolute value $|\cdot|_{P_j}$ on K such that the inclusion $\varphi_0 : F \to K$ is a homomorphism
of valued fields. The completion mapping $\psi_j : K \to K_{P_j}$ in turn gives us a
second determination of $|\cdot|_{P_j}$ on $K_{P_j}$, and Lemma 6.35 below says that these
two determinations match, i.e., that Figure 6.2 is a commutative diagram of
homomorphisms of valued fields.


[Diagram: the square formed by the completion maps $\psi_0 : (F, |\cdot|_p) \to (F_p, |\cdot|_p)$ and
$\psi_j : (K, |\cdot|_{P_j}) \to (K_{P_j}, |\cdot|_{P_j})$ and the inclusions $\varphi_0 : (F, |\cdot|_p) \to (K, |\cdot|_{P_j})$ and
$\varphi_j : (F_p, |\cdot|_p) \to (K_{P_j}, |\cdot|_{P_j})$.]


FIGURE 6.2. Commutativity of completion and extension
as homomorphisms of valued fields.
Lemma 6.35. In the above notation the two determinations of $|\cdot|_{P_j}$ on $K_{P_j}$
coincide, one by using Theorem 6.33 to insist that $\varphi_j\psi_0$ in Figure 6.2 be the
composition of homomorphisms of valued fields, and the other by using Proposition
6.34 to insist that $\psi_j\varphi_0$ in Figure 6.2 be the composition of homomorphisms
of valued fields.


REMARKS. The commutativity formula $\psi_j\varphi_0 = \varphi_j\psi_0$ for field mappings is
known from the discussion concerning Figure 6.1.


PROOF. Let us give two different names to the two possible absolute values on
$K_{P_j}$, writing $|\cdot|'$ for the one that makes $|\psi_j(k)|' = |k|_{P_j}$ for $k \in K$ and writing
$|\cdot|''$ for the other, which makes $|\varphi_j(x)|'' = |x|_p$ for $x \in F_p$. Let y be in F. Then
the equality $\varphi_j\psi_0 = \psi_j\varphi_0$ implies that


$$|\varphi_j\psi_0(y)|' = |\psi_j\varphi_0(y)|' = |\varphi_0(y)|_{P_j} = |y|_p = |\psi_0(y)|_p. \tag{*}$$


If $x_0$ is given in $F_p$, then we can choose a sequence $\{x_n\}$ in F with $\{\psi_0(x_n)\}$
convergent to $x_0$ in $F_p$. Then $\{\psi_0(x_n)\}$ is Cauchy in the metric on $F_p$, and
it follows from (*) applied with $y = x_m - x_n$ that $\{\varphi_j\psi_0(x_n)\}$ is Cauchy in
the metric from $|\cdot|'$ on $K_{P_j}$. If we have a second such sequence $\{x_n'\}$ in F
with $\psi_0(x_n')$ convergent to $x_0$ and if we alternate the terms of $\{x_n\}$ and $\{x_n'\}$ to
produce a sequence $\{z_n\}$, then $\{\varphi_j\psi_0(z_n)\}$ remains Cauchy in the metric from $|\cdot|'$.
Since $|\cdot|'$ is complete, it follows that $|\varphi_j(x_0)|'$ is given by a well-defined limit
independently of the sequence in $\psi_0(F)$ used to approximate $x_0$. The formula
(*) shows that $|\varphi_j(x_0)|' = |x_0|_p$, and the definition of $|\cdot|''$ shows that this equals
$|\varphi_j(x_0)|''$. By the uniqueness in Theorem 6.33, $|\cdot|' = |\cdot|''$ on $K_{P_j}$.


Lemma 6.36. In the above notation and that of Theorem 6.31, the ramification
index $e_j^*$ corresponding to $K_{P_j}/F_p$ for the closure of the ideal $\psi_j(P_j)$ coincides
with the ramification index $e_j$ corresponding to K/F for the ideal $P_j$.


REMARK. In addition, the residue class degree $f_j^*$ for $K_{P_j}/F_p$ coincides with
the residue class degree $f_j$ for K/F. In fact, the five paragraphs of review that
follow Theorem 6.26 mention that residue class fields change neither during the
localization step nor in the completion step of our two-step process. Thus R/p
remains the same during the two steps, and so does $T/P_j$. Hence the dimension
of $T/P_j$ as a vector space over R/p remains the same.


PROOF. Let $v_{p,F}$, $v_{P_j,K}$, $v_{p,F_p}$, and $v_{P_j,K_{P_j}}$ be the valuations corresponding to
the absolute values on F, K, $F_p$, and $K_{P_j}$, respectively. The last of these is well
defined by Lemma 6.35. Proposition 6.34 shows that


$$e_j v_{p,F} = v_{P_j,K}\,\varphi_0 \quad\text{and}\quad e_j^* v_{p,F_p} = v_{P_j,K_{P_j}}\,\varphi_j. \tag{*}$$

Meanwhile, the completion mappings $\psi_0$ and $\psi_j$ satisfy

$$v_{p,F_p}\,\psi_0 = v_{p,F} \quad\text{and}\quad v_{P_j,K_{P_j}}\,\psi_j = v_{P_j,K}. \tag{**}$$


Multiplying the second equation of (*) on the right by $\psi_0$ and substituting from
the first equation of (**), we obtain


$$e_j^* v_{p,F} = e_j^* v_{p,F_p}\,\psi_0 = v_{P_j,K_{P_j}}\,\varphi_j\psi_0.$$


We substitute from the commutativity formula $\varphi_j\psi_0 = \psi_j\varphi_0$ and unwind the right
side as


$$v_{P_j,K_{P_j}}\,\psi_j\varphi_0 = v_{P_j,K}\,\varphi_0 = e_j v_{p,F}.$$


Thus $e_j^* v_{p,F} = e_j v_{p,F}$. Since $v_{p,F}$ is not identically 0, we obtain $e_j^* = e_j$.


PROOF OF THEOREM 6.31. As was mentioned before the statement of the
theorem, it follows from Proposition 2.29 and the Wedderburn theory that $K \otimes_F F_p$
is isomorphic to a product $\prod_{i=1}^{g'} L_i$ of fields, each of which is a finite extension
of $F_p$ and each of which has K embedded in it. The subfields $L_i$ are uniquely
determined within $K \otimes_F F_p$, and we let $\eta_i$ be the projection of $K \otimes_F F_p$ onto
$L_i$. Each $\eta_i$ is a ring homomorphism and is given by multiplication by a specific
element of $K \otimes_F F_p$, namely the element that is 1 in the i-th position and is 0 in
the other positions. When restricted to $K \otimes 1$, $\eta_i$ gives a field map $\alpha_i : K \to L_i$;
when restricted to $1 \otimes F_p$, it gives a field map $\beta_i : F_p \to L_i$.

We shall develop a small abstract theory about these field maps $\alpha_i$ and $\beta_i$.
Suppose that M is a field containing F, that $\alpha : K \to M$ and $\beta : F_p \to M$ are
F algebra homomorphisms, and that M is a finite separable extension of $\beta(F_p)$.
Theorem 6.33 says that M has a unique absolute value $|\cdot|_{p,\beta}$ extending $|\cdot|_p$
and that the valued field $(M, |\cdot|_{p,\beta})$ is complete. The extension property means
that $\beta : (F_p, |\cdot|_p) \to (M, |\cdot|_{p,\beta})$ is a homomorphism of valued fields. The
restriction $\alpha^*(|\cdot|_{p,\beta})$ to K makes $(K, \alpha^*(|\cdot|_{p,\beta}))$ into a valued field in such a
way that


$$\alpha : (K, \alpha^*(|\cdot|_{p,\beta})) \to (M, |\cdot|_{p,\beta}) \tag{*}$$


is a homomorphism of valued fields. Let us see that

$$\alpha^*(|\cdot|_{p,\beta}) \text{ is one (and only one) of the absolute values } |\cdot|_{P_j} \text{ on } K \tag{**}$$

and that $\alpha$ in (*) factors as the composition of the completion mapping


$$\psi_j : (K, |\cdot|_{P_j}) \to (K_{P_j}, |\cdot|_{P_j})$$


followed by some other homomorphism of valued fields

$$\iota : (K_{P_j}, |\cdot|_{P_j}) \to (M, |\cdot|_{p,\beta}).$$

To get at (**) and the factorization of $\alpha$, let us show that the field mapping


$$\varphi_0 : (F, |\cdot|_p) \to (K, \alpha^*(|\cdot|_{p,\beta})) \tag{\dagger}$$


is a homomorphism of valued fields, i.e., that $\varphi_0^*\alpha^*(|\cdot|_{p,\beta}) = |\cdot|_p$. The field
mappings $\alpha\varphi_0$ and $\beta\psi_0$, which carry F into M via K and $F_p$, respectively, are
compositions of F homomorphisms and hence are F homomorphisms. Therefore
$x \in F$ implies that $\alpha\varphi_0(x) = x(\alpha\varphi_0(1)) = x1 = x(\beta\psi_0(1)) = \beta\psi_0(x)$, and
we see that $\alpha\varphi_0 = \beta\psi_0$ on F. For $x \in F$, this identity accounts for the third
equality in the following computation proving (†):


$$|x|_p = |\psi_0 x|_p = |\beta\psi_0 x|_{p,\beta} = |\alpha\varphi_0 x|_{p,\beta} = \alpha^*(|\cdot|_{p,\beta})(\varphi_0 x) = \varphi_0^*\alpha^*(|\cdot|_{p,\beta})(x).$$


Returning to (**) and applying (†), we see that $\alpha^*(|\cdot|_{p,\beta})$ is $\le 1$ on R. Since
T is the integral closure of R, Proposition 6.20 shows that $\alpha^*(|\cdot|_{p,\beta})$ is $\le 1$
on T and that it arises from some nonzero prime ideal of T, necessarily one of
the ideals $P_1, \dots, P_g$. This proves (**). Then the factorization of $\alpha$ in (*) follows from
(**) and the universal mapping property of completions as given in Theorem
6.25, since $(M, |\cdot|_{p,\beta})$ is complete.

Now let us specialize by taking $M = L_i$ with i fixed. As in the first
paragraph of the proof, the projection $\eta_i : K \otimes_F F_p \to L_i$ gives us field mappings
$\alpha_i : K \to L_i$ and $\beta_i : F_p \to L_i$ by composing $\eta_i$ with $K \to K \otimes 1$ and
with $F_p \to 1 \otimes F_p$. If $u_1, \dots, u_n$ is a vector-space basis of K over F, then
$u_1 \otimes 1, \dots, u_n \otimes 1$ is a vector-space basis of $K \otimes_F F_p$ over $F_p$, and it follows
that $L_i$ is finite-dimensional over $F_p$. Let us check that $L_i$ is separable over $F_p$.
We are given that K is separable over F, hence that $K = F(\xi)$ for an element
$\xi$ whose minimal polynomial g(X) over F is separable. Then $\xi \otimes 1$ is a root of
g(X) regarded as in $F_p[X]$, and so is $\eta_i(\xi \otimes 1)$. Therefore $L_i/F_p$ is separable,
and the above theory is applicable. In the theory, $L_i$ acquires an absolute value
$|\cdot|_{p,\beta_i}$ such that $\beta_i : (F_p, |\cdot|_p) \to (L_i, |\cdot|_{p,\beta_i})$ is a homomorphism of valued
fields, and then $(L_i, |\cdot|_{p,\beta_i})$ is complete. The theory produces a unique index
j = j(i) making $\alpha_i : (K, |\cdot|_{P_j}) \to (L_i, |\cdot|_{p,\beta_i})$ into a homomorphism of
valued fields.


Let us see that $\alpha_i(K)$ is dense in $L_i$. Every member of $L_i$ is the image under $\eta_i$
of some member $\sum_{l=1}^{n} u_l \otimes c_l$ of $K \otimes_F F_p$ with each $c_l$ in $F_p$. The computation


$$\eta_i(u_l \otimes c_l) = \eta_i(u_l \otimes 1)\,\eta_i(1 \otimes c_l) = \alpha_i(u_l)\beta_i(c_l)$$

shows that every member of $L_i$ is of the form $\sum_{l=1}^{n} \alpha_i(u_l)\beta_i(c_l)$. Since F is dense
in $F_p$, we can choose members $c_l'$ of F as close as we please to $c_l$. Since $\beta_i$ is
isometric, $\sum_{l=1}^{n} \alpha_i(u_l)\beta_i(c_l)$ is then close to $\sum_{l=1}^{n} \alpha_i(u_l)\beta_i(c_l') = \sum_{l=1}^{n} \alpha_i(c_l' u_l)$.
Consequently $\alpha_i(K)$ is indeed dense in $L_i$.

Recall in connection with (*) that $\alpha_i : K \to L_i$ factors as a composition of
homomorphisms of valued fields, namely as $\psi_j : (K, |\cdot|_{P_j}) \to (K_{P_j}, |\cdot|_{P_j})$
followed by $\iota : (K_{P_j}, |\cdot|_{P_j}) \to (L_i, |\cdot|_{p,\beta_i})$. Since $K_{P_j}$ is complete, $\iota(K_{P_j})$
is closed in $L_i$. The dense image $\alpha_i(K) = \iota(\psi_j(K))$ in $L_i$ is contained in the
closed subset $\iota(K_{P_j})$, and it follows that $\iota$ is onto $L_i$. That is, the homomorphism
of valued fields

$$\iota : (K_{P_j}, |\cdot|_{P_j}) \to (L_i, |\cdot|_{p,\beta_i})$$


is an isomorphism. This identifies the valued field $(L_i, |\cdot|_{p,\beta_i})$ as isomorphic to
$(K_{P_j}, |\cdot|_{P_j})$.

As a consequence of the argument thus far, we have constructed a choice-free
function $i \mapsto j(i)$ carrying $\{1, \dots, g'\}$ into $\{1, \dots, g\}$. The function has the
property that $K_{P_{j(i)}}$ is isomorphic as a valued field to $L_i$ for each i. We are going
to show that $i \mapsto j(i)$ is onto $\{1, \dots, g\}$. Thus let the completion homomorphism
$\psi_j : (K, |\cdot|_{P_j}) \to (K_{P_j}, |\cdot|_{P_j})$ be given.

The F bilinear mapping $(\psi_j, \varphi_j) : K \times F_p \to K_{P_j}$ given by multiplication
has a linear extension


$$\psi_j \otimes \varphi_j : K \otimes_F F_p \to K_{P_j}$$


that is a ring homomorphism. The range $K_{P_j}$ is a field that is finite-dimensional
over $\varphi_j(F_p)$, and the image of $\psi_j \otimes \varphi_j$ is a $\varphi_j(F_p)$ vector subspace of $K_{P_j}$ that is
closed under multiplication. Consequently the image of $\psi_j \otimes \varphi_j$ is closed under
inverses¹⁸ and is a field. The kernel of $\psi_j \otimes \varphi_j$ is therefore a maximal ideal, and
it follows that there exists some i such that $\psi_j \otimes \varphi_j$ factors as a composition of
$\eta_i : K \otimes_F F_p \to L_i$ followed by a field map $\gamma : L_i \to K_{P_j}$.

Having constructed a particular $L_i$, let us form $\alpha_i$, $\beta_i$, and $P_{j(i)}$ as in the abstract
theory with M. The map $\beta_i : (F_p, |\cdot|_p) \to (L_i, |\cdot|_{p,\beta_i})$ is a homomorphism of
valued fields such that $\gamma\beta_i = \varphi_j$, and the map $\alpha_i : (K, |\cdot|_{P_{j(i)}}) \to (L_i, |\cdot|_{p,\beta_i})$
is a homomorphism of valued fields such that $\gamma\alpha_i = \psi_j$. The existence part of
Theorem 6.33 shows that there exists an absolute value $|\cdot|_\gamma$ on $K_{P_j}$ such that
$\gamma : (L_i, |\cdot|_{p,\beta_i}) \to (K_{P_j}, |\cdot|_\gamma)$ is a homomorphism of valued fields. Since
$\varphi_j^*(|\cdot|_{P_j}) = |\cdot|_p = \beta_i^*(|\cdot|_{p,\beta_i}) = \beta_i^*\gamma^*(|\cdot|_\gamma) = \varphi_j^*(|\cdot|_\gamma)$, the uniqueness
part of Theorem 6.33 shows that $|\cdot|_\gamma = |\cdot|_{P_j}$ on $K_{P_j}$. Meanwhile, the equality
$\psi_j = \gamma\alpha_i$ implies that $\psi_j^* = \alpha_i^*\gamma^*$. Then we have


$$\begin{aligned}
(|\cdot|_{P_j} \text{ on } K) &= \psi_j^*(|\cdot|_{P_j} \text{ on } K_{P_j}) \\
&= \alpha_i^*\gamma^*(|\cdot|_{P_j} \text{ on } K_{P_j}) && \text{since } \psi_j^* = \alpha_i^*\gamma^* \\
&= \alpha_i^*\gamma^*(|\cdot|_\gamma \text{ on } K_{P_j}) && \text{since } |\cdot|_\gamma = |\cdot|_{P_j} \\
&= \alpha_i^*(|\cdot|_{p,\beta_i} \text{ on } L_i) \\
&= (|\cdot|_{P_{j(i)}} \text{ on } K).
\end{aligned}$$


Therefore j = j(i), and the map $i \mapsto j(i)$ is onto.


¹⁸The same argument applies here with $F_p$ as was used in Section 4 with R: within a field if a
nonzero element is algebraic over a base field, then the smallest ring containing the base field and
the element contains also the inverse of the element.

To complete the proof, let us compute dimensions relative to $F_p$, starting
from the decomposition into fields $L_i$. The ramification index $e_j^*$ and the residue
class degree $f_j^*$ for the valuation ring and ideal of $K_{P_j}$ equal the corresponding
parameters $e_j$ and $f_j$ for T and $P_j$, by Lemma 6.36. Thus we have


$$\begin{aligned}
n = \sum_{i=1}^{g'} \dim_{F_p} L_i &= \sum_{i=1}^{g'} \dim_{F_p} K_{P_{j(i)}} = \sum_{j=1}^{g} \sum_{j(i)=j} \dim_{F_p} K_{P_{j(i)}} \\
&= \sum_{j=1}^{g} \sum_{j(i)=j} e_{j(i)}^* f_{j(i)}^* = \sum_{j=1}^{g} \sum_{j(i)=j} e_{j(i)} f_{j(i)} = \sum_{j=1}^{g} |\{i \mid j(i) = j\}|\, e_j f_j.
\end{aligned}$$


On the other hand, we know that $n = \sum_{j=1}^{g} e_j f_j$, and we have just proved that
$|\{i \mid j(i) = j\}| \ge 1$ for each j. It follows that $|\{i \mid j(i) = j\}| = 1$ for each
j, i.e., that the function $i \mapsto j(i)$ is one-one onto. In particular, g' = g. The
theorem follows.


Notationally what is happening in the proof of the theorem is that a function
$i \mapsto j(i)$ is constructed such that $\alpha_i : K \to L_i$ factors as $\alpha_i = \iota\psi_{j(i)}$ for some
canonical isomorphism $\iota : K_{P_{j(i)}} \to L_i$ of complete valued fields. Renumbering
the factors and ignoring canonical isomorphisms, we find that $K \otimes_F F_p$ is the
direct product of the factors $K_{P_i}$ and that $\alpha_i = \psi_i$ carries K to $K \otimes 1$ and then
to the i-th factor $K_{P_i}$. Any linear mapping of the form $A \otimes 1$ in effect is therefore
block diagonal with each block corresponding to the effect on some $K_{P_i}$.

Let us apply these considerations to operations “left-multiplication-by,” which
we write as $l(\cdot)$. If $\xi$ is a member of K, the characteristic polynomial of $l(\xi)$
over F is $\det(X1 - l(\xi))$, and the characteristic polynomial of $l(\xi) \otimes 1$ over $F_p$
is still $\det(X1 - l(\xi))$, but now with its coefficients from F regarded as members
of $F_p$ via the inclusion $\psi_0 : F \to F_p$.

The linear function $X(1 \otimes 1) - l(\xi) \otimes 1$ is block diagonal, equal to
$X1 - l(\psi_i(\xi))$ on the i-th block for $1 \le i \le g$. The characteristic polynomial
$\det(X1 - l(\xi))$, regarded as having coefficients in $F_p$, is therefore the product
of the g characteristic polynomials $\det(X1 - l(\psi_i(\xi)))$, each with coefficients in $F_p$.
In turn, this product formula yields a sum formula for the trace $\mathrm{Tr}_{K/F}(\xi)$ and
a product formula for the norm $N_{K/F}(\xi)$. If $\xi$ is a primitive element for the
extension K/F, then we can say even more. Let us write all these consequences
as a corollary.

Corollary 6.37. Let R be a Dedekind domain regarded as a subring of its field
of fractions F, let K be a finite separable extension of F with [K : F] = n, and
let T be the integral closure of R in K. Let p be a nonzero prime ideal of R, and
let the ideal pT in T have a prime factorization of the form $pT = P_1^{e_1} \cdots P_g^{e_g}$,
where $P_1, \dots, P_g$ are distinct prime ideals in T and $e_1, \dots, e_g$ are positive. For
$1 \le i \le g$, let $f_i = [T/P_i : R/p]$. If $\xi$ is any element of K, then

(a) the F linear map $l(\xi)$ on K given by left multiplication by $\xi$ has the
property that its field polynomial $\det(X - l(\xi))$ over F, when reinterpreted
as having coefficients in $F_p$, factors over $F_p$ as the product


$$\det(X - l(\xi)) = \prod_{i=1}^{g} \det(X - l(\xi_i))$$


of the g field polynomials of the images $\xi_i = \psi_i(\xi)$ under the completion
map $\psi_i : K \to K_{P_i}$,
(b) $N_{K/F}(\xi) = \prod_{i=1}^{g} N_{K_{P_i}/F_p}(\xi_i)$,
(c) $\mathrm{Tr}_{K/F}(\xi) = \sum_{i=1}^{g} \mathrm{Tr}_{K_{P_i}/F_p}(\xi_i)$.
Furthermore, if $\xi$ and F together generate K, if m(X) is the minimal polynomial
of $\xi$ over F, and if $m(X) = \prod_{i=1}^{g'} m_i(X)$ expresses m(X) as the product of
distinct monic irreducible polynomials in $F_p[X]$, then
(d) g' = g,
(e) there is a one-one onto function $i \mapsto k(i)$ on the set $\{1, \dots, g\}$ such that
$K_{P_i}$ is isomorphic as a field to $F_p[X]/(m_{k(i)}(X))$,
(f) $\deg m_{k(i)}(X) = e_i f_i$.


PROOF. Conclusion (a) was proved in the paragraph before the statement of
the corollary, and (b) and (c) follow immediately from (a).

Under the assumption that $K = F(\xi)$, the minimal polynomial m(X) of
$\xi$ and the characteristic polynomial $\det(X1 - l(\xi))$ are equal; thus $m(X) =
\det(X - l(\xi))$ is irreducible over F. Applying Proposition 2.29a, we see that
$K \otimes_F F_p \cong F_p[X]/(m(X))$ as an $F_p$ algebra. The assumed separability of K/F
means that m(X) is a separable polynomial, and m(X) therefore factors over the
extension field $F_p$ of F as a product of distinct monic irreducible polynomials
in $F_p[X]$, say as $m(X) = m_1(X) \cdots m_{g'}(X)$. The Chinese Remainder Theorem
implies that


$$K \otimes_F F_p \cong \prod_{i=1}^{g'} F_p[X]/(m_i(X)),$$

and each $F_p[X]/(m_i(X))$ is a field. The factors on the right must coincide with
the factors in Theorem 6.31, and it follows that g' = g and that each $K_{P_i}$ is of
the form $F_p[X]/(m_k(X))$ for some k = k(i). This proves (d) and (e). For (f),
$\deg m_{k(i)}(X)$ is the product of the ramification index and the residue class degree
for $K_{P_i}/F_p$, and this product equals $e_i f_i$ as a consequence of Lemma 6.36 and
its remark.
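Conclusions (b) and (c) can be checked numerically in the example of Q(i) and p = 5Z considered earlier in this section. Both completions equal Q_5 there, so the local norms and traces of $\xi_1$ and $\xi_2$ are $\xi_1$ and $\xi_2$ themselves, and the corollary predicts $N_{K/F}(\xi) = \xi_1\xi_2$ and $\mathrm{Tr}_{K/F}(\xi) = \xi_1 + \xi_2$. The sketch below works modulo 5^10 with an approximate square root r of -1; the precision, the sample element, and the helper names are assumptions made for illustration.

```python
def sqrt_minus_one_mod(p, k, r0):
    """Approximate a root of X^2 + 1 modulo p^k by Newton iteration from r0."""
    r, modulus = r0 % p, p
    for _ in range(k - 1):
        modulus *= p
        r = (r - (r * r + 1) * pow(2 * r, -1, modulus)) % modulus
    return r


M = 5 ** 10
r = sqrt_minus_one_mod(5, 10, 2)


def local_images(a, b):
    """Images of a + b*i under the two embeddings i -> r and i -> -r, modulo M."""
    return (a + b * r) % M, (a - b * r) % M


a, b = 7, -3                      # sample element xi = 7 - 3i
xi1, xi2 = local_images(a, b)
norm_global = a * a + b * b       # N(a + b i) = a^2 + b^2
trace_global = 2 * a              # Tr(a + b i) = 2a
print((xi1 * xi2 - norm_global) % M == 0)
print((xi1 + xi2 - trace_global) % M == 0)
```

The same two congruences say that the field polynomial X^2 - 2aX + (a^2 + b^2) agrees with (X - xi1)(X - xi2) modulo 5^10, which is conclusion (a) in this example up to the chosen precision.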


A by-product of (d) is that we obtain a way of computing g for the extension:
it is the number of irreducible factors into which m(X) splits when it is factored
over $F_p$ instead of F. Hensel’s Lemma in the form of Theorem 6.30 can help
with carrying out this factorization in favorable cases if $\xi$ is chosen to be integral
over R, i.e., to be in T. Namely we reduce the coefficients of m(X) modulo
p, obtaining a monic polynomial in (R/p)[X], and we factor this polynomial¹⁹
as a product of powers of distinct primes in (R/p)[X]. Since the powers of
distinct primes are relatively prime and since everything is monic, Theorem 6.30
is applicable and allows us to lift the factorization to $F_p[X]$. The resulting monic
factors in $F_p[X]$ may not be irreducible in unfavorable circumstances,²⁰ but we
have at least made progress.
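The reduction step can be sketched in code. The example below assumes R = Z and p = 2Z and uses the polynomial X^3 + X^2 - 2X + 8 of footnote 20. It finds only the linear factors of the reduced polynomial, with multiplicity, by trial division, which suffices for this cubic but is not a general factorization routine. The function names are hypothetical.

```python
def reduce_mod(coeffs, p):
    """Reduce integer coefficients (constant term first) modulo p."""
    return [c % p for c in coeffs]


def divide_by_linear(coeffs, a, p):
    """Divide sum(coeffs[i] * X^i) by (X - a) over Z/pZ; return (quotient, remainder)."""
    quotient = [0] * (len(coeffs) - 1)
    carry = 0
    # Synthetic division, working from the leading coefficient downward.
    for i in range(len(coeffs) - 1, 0, -1):
        carry = (coeffs[i] + carry * a) % p
        quotient[i - 1] = carry
    remainder = (coeffs[0] + carry * a) % p
    return quotient, remainder


def root_multiplicities(coeffs, p):
    """Map each root a in Z/pZ of the reduced polynomial to its multiplicity."""
    result = {}
    for a in range(p):
        f, mult = reduce_mod(coeffs, p), 0
        while len(f) > 1:
            q, rem = divide_by_linear(f, a, p)
            if rem != 0:
                break
            f, mult = q, mult + 1
        if mult:
            result[a] = mult
    return result


m = [8, -2, 1, 1]  # X^3 + X^2 - 2X + 8, constant term first
print(root_multiplicities(m, 2))
```

Modulo 2 the roots are 0 with multiplicity 2 and 1 with multiplicity 1, matching the reduced polynomial X^2(X + 1); the relatively prime factors X^2 and X + 1 are the ones to which the lifting step is then applied.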

Theorem 6.31 has accomplished even more than is stated in Corollary 6.37.
For each j, it has identified a field extension, namely $K_{P_j}/F_p$, in which the indices
$e_j$ and $f_j$ are isolated from the other $e_i$’s and $f_i$’s. Under an additional hypothesis
on the residue class field (it is enough to assume that the residue class field is
finite), Proposition 6.38 below shows that it is possible to interpolate a unique
intermediate field L with $F_p \subseteq L \subseteq K_{P_j}$ such that the residue class degree (the
parameter f) of $K_{P_j}/L$ is 1 and the ramification index (the parameter e) of $L/F_p$
is 1. Thus the proposition says that we can separate $e_j$ and $f_j$ from each other.
One says that $K_{P_j}/L$ is totally ramified and $L/F_p$ is unramified.


Proposition 6.38. Let F be a complete valued field under a nonarchimedean
discrete valuation v, let R and p be the valuation ring and valuation ideal for v, let
K be a finite separable extension of F of degree n, let T be the integral closure of
R in K, and let P be the unique maximal ideal in T as in Theorem 6.33. Suppose
that R/p is a finite field. Let e be the integer such that $pT = P^e$, and let f be
the dimension of T/P over R/p. Then there exists a unique intermediate field L
for which the integral closure U of R in L and the unique maximal ideal ℘ in U
have the following properties:


(a) $pU = \wp$ and $\wp T = P^e$,
(b) $[U/\wp : R/p] = f$ and $[T/P : U/\wp] = 1$.


¹⁹On a computer, for example, if R/p is finite.

²⁰In Example 5 in the previous section, the given polynomial in $\mathbb{Z}[X]$ is $m(X) = X^3 + X^2 - 2X + 8$,
and the reduced polynomial in $\mathbb{F}_2[X]$ is $X^2(X + 1)$. Theorem 6.30 exhibits a factorization of m(X)
over $\mathbb{Z}_2[X]$ as the product of a linear factor and a quadratic factor, and we saw in Example 5 of
Section 5 that the quadratic factor is reducible over $\mathbb{Z}_2[X]$.

The proof is carried out in Problems 15—16 at the end of the chapter. We shall 
apply Proposition 6.38 in Section 8. The intermediate field L in the proposition 
is called the inertia subfield of K/F. 

Once this separation of an extension of a complete valued field into a totally 
ramified extension and an unramified extension has been accomplished, one can 
go on to study each kind of extension separately, in order to find out what kind 
of ramification is possible. The results are stated as Lemmas 6.47 and 6.48, and 
proofs are carried out in Problems 17—19 at the end of the chapter. 


7. Special Features of Galois Extensions 


In this section we analyze what happens in the setting of Theorem 6.31 when
the extension of fields is a Galois extension. For simplicity for the moment, let
us work with the number-field setting, even though analogous results hold for
function fields in one variable as well. Thus let $K/F$ be a finite Galois extension
of number fields, let $T$ and $R$ be the rings of algebraic integers in $K$ and $F$
respectively, and let $\mathfrak p$ be a nonzero prime ideal in $R$. Since the extension $K/F$
is Galois, the Galois group $\mathrm{Gal}(K/F)$ permutes transitively the nonzero prime
ideals containing $\mathfrak pT$, and the factorization of $\mathfrak pT$ into powers of distinct prime
ideals of $T$ takes the special form $\mathfrak pT = P_1^e \cdots P_g^e$ with all the exponents the
same.²¹ In addition, the dimension of each finite field $T/P_i$ over $R/\mathfrak p$ is an
integer $f$ independent of $i$, and we have $efg = [K : F]$.
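As a small illustration of the relation $efg = [K : F]$, here is a sketch for the Galois extension $\mathbb Q(i)/\mathbb Q$. It assumes the standard fact that the ring of integers of $\mathbb Q(i)$ is $\mathbb Z[i] \cong \mathbb Z[X]/(X^2 + 1)$, so that the factorization of $(p)$ mirrors the factorization of $X^2 + 1$ modulo $p$; the function name `splitting_type_gaussian` is hypothetical.

```python
def splitting_type_gaussian(p):
    """Return (e, f, g) for a prime number p in Z[i].

    The triple is read off from the number of roots of X^2 + 1 in F_p:
    two roots     -> two primes, e = f = 1 (p splits),
    one root      -> X^2 + 1 is a square modulo p, e = 2 (p ramifies),
    no roots      -> X^2 + 1 is irreducible, f = 2 (p is inert).
    """
    roots = [r for r in range(p) if (r * r + 1) % p == 0]
    if len(roots) == 2:
        return (1, 1, 2)
    if len(roots) == 1:
        return (2, 1, 1)
    return (1, 2, 1)

for p in (2, 3, 5, 7, 13):
    e, f, g = splitting_type_gaussian(p)
    print(p, (e, f, g), e * f * g)   # the product is always [Q(i) : Q] = 2
```

In each case there is a single pair $(e, f)$ attached to $p$, as the Galois hypothesis requires, and only the prime 2 has $e > 1$.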

Let us review Theorem 9.64 and its surrounding discussion in Basic Algebra.
If we write $P$ for one of the ideals $P_i$, then the subgroup $G_P$ of $G = \mathrm{Gal}(K/F)$
consisting of the elements $\sigma$ with $\sigma(P) = P$ is called the decomposition group at $P$.
Each $\sigma \in G_P$ descends to an automorphism $\bar\sigma$ of $T/P$ that fixes $R/\mathfrak p$, thereby
yielding a member of $\overline G = \mathrm{Gal}((T/P)/(R/\mathfrak p))$. The map $G_P \to \overline G$ is certainly a
homomorphism, and Theorem 9.64 of Basic Algebra says that it is onto. Since $G_P$
has order $|G|/g = ef$ and $\overline G$ has order $f$, it follows that this
homomorphism is $e$-to-1. In Basic Algebra this homomorphism was of interest
when $F = \mathbb Q$ and $e = 1$, since it ensures the presence of certain kinds of
permutations in $G$ and makes it possible to determine $G$ completely in certain
circumstances.


²¹Lemma 9.61 and Theorem 9.62 of Basic Algebra.

Theorem 6.31 allows us to isolate each prime ideal $P$ in such an analysis,
reinterpreting everything in the context of a particular $\mathfrak p$-adic field. Carrying
through this process gives insights into the decomposition group and the nature
of the homomorphism $G_P \to \overline G$. The point of this section is to explain some of
these insights.

We work within the setting of Theorem 6.31 except that we assume that the
residue class fields are finite fields, as they are in the number-theory context. Thus
let $R$ be a Dedekind domain regarded as a subring of its field of fractions $F$, let
$K$ be a finite Galois extension of $F$ with $[K : F] = n$, and let $T$ be the integral
closure of $R$ in $K$. We suppose that $\mathfrak p$ is a nonzero prime ideal of $R$ and that $R/\mathfrak p$
is a finite field. Let $\mathfrak pT = P_1^e \cdots P_g^e$ be the prime factorization of the ideal $\mathfrak pT$
in $T$; here $P_1, \dots, P_g$ are assumed to be distinct prime ideals in $T$. Let $f$ be the
common value of the dimension of $T/P_i$ over $R/\mathfrak p$.

In the decomposition $K \otimes_F F_{\mathfrak p} \cong \prod_{i=1}^{g} K_{P_i}$ of Theorem 6.31, the projection
$\eta_i$ to the $i$th factor on the right side is a member of $K \otimes_F F_{\mathfrak p}$; specifically it is the
member of the direct product whose $i$th coordinate is the multiplicative identity
of $K_{P_i}$ and whose other coordinates are 0. The element $\eta_i$ is an idempotent in
the sense that $\eta_i^2 = \eta_i$, and the $\eta_i$'s are orthogonal in the sense that $\eta_i\eta_j = 0$ for
$i \ne j$. The only idempotents of $K \otimes_F F_{\mathfrak p}$ are the sums of distinct elements $\eta_i$,
and the $\eta_i$'s are distinguished from the other idempotents in being primitive: $\eta_i$
is not the sum of two nonzero orthogonal idempotents.
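The behavior of idempotents in a finite product of fields can be seen in a toy model. The sketch below does not compute in $K \otimes_F F_{\mathfrak p}$; as a stand-in it uses the ring $\mathbb Z/30\mathbb Z \cong \mathbb F_2 \times \mathbb F_3 \times \mathbb F_5$, which is likewise a product of three fields. The function names are hypothetical.

```python
def idempotents(n):
    """All a in Z/nZ with a^2 = a."""
    return [a for a in range(n) if (a * a - a) % n == 0]

def primitive_idempotents(n):
    """Nonzero idempotents that are not a sum of two nonzero orthogonal idempotents."""
    nonzero = [a for a in idempotents(n) if a != 0]
    primitive = []
    for a in nonzero:
        splits = any((b + c) % n == a and (b * c) % n == 0
                     for b in nonzero for c in nonzero)
        if not splits:
            primitive.append(a)
    return primitive

all_idem = idempotents(30)            # 2^3 = 8 idempotents, one per subset of factors
prim = primitive_idempotents(30)      # one primitive idempotent per field factor
```

In this model the primitive idempotents are 6, 10, and 15, they are pairwise orthogonal, they sum to 1 modulo 30, and every idempotent is a sum of distinct ones among them, which is the pattern described for the elements $\eta_i$.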

Recall the relationship derived in the proof of Theorem 6.31 between $P_i$ and
the element $\eta_i$: the mapping $\beta_i : F_{\mathfrak p} \to K_{P_i}$ given by $\beta_i(x) = (1 \otimes x)\eta_i$ for
$x \in F_{\mathfrak p}$ is a homomorphism of valued fields, and so is the mapping $\alpha_i : K \to K_{P_i}$
given by $\alpha_i(k) = (k \otimes 1)\eta_i$ for $k \in K$. These facts uniquely determine $P_i$ from
among the ideals $P_1, \dots, P_g$.

We extend the action by each member $\sigma$ of $G = \mathrm{Gal}(K/F)$ to $K \otimes_F F_{\mathfrak p}$ as the
transformation $\sigma \otimes 1$. Then $G$ acts on $K \otimes_F F_{\mathfrak p}$, manifestly keeping each element
of $F_{\mathfrak p}$ fixed. Since the members of $G$ respect multiplication and addition, they
map idempotents to idempotents in $K \otimes_F F_{\mathfrak p}$, sending primitive idempotents to
primitive idempotents. Thus $G$ permutes the elements $\eta_i$. The elements $x$ with
$\eta_i x = x$ are exactly the members of $K_{P_i}$, and hence $G$ permutes the fields $K_{P_i}$.

Lemma 6.39. In the above setting with $K/F$ Galois, let $P_i$ be one of the ideals
$P_1, \dots, P_g$. Then a member $\sigma$ of the Galois group $G = \mathrm{Gal}(K/F)$ extends to a
field automorphism of $K_{P_i}$ fixing $F_{\mathfrak p}$ if and only if it is an isometry of $(K, |\cdot|_{P_i})$,
i.e., if and only if $\sigma$ satisfies $|\sigma x|_{P_i} = |x|_{P_i}$ for all $x \in K$.


PROOF. If $\sigma$ is an isometry from $K$ into itself in the metric determined by $|\cdot|_{P_i}$,
then $\sigma$ is uniformly continuous as a function from $K$ into the complete space $K_{P_i}$
and therefore extends to a continuous function from the completion $K_{P_i}$ into $K_{P_i}$.
It follows from the continuity of the extension and the fact that $\sigma$ respects the
operations on $K$ that $\sigma$ respects the operations on $K_{P_i}$. These remarks apply also
to the extension of $\sigma^{-1}$, and the extension of $\sigma^{-1}$ is a two-sided inverse to the
extension of $\sigma$. Since $\sigma$ is the identity on $F$, the continuity forces the extension
of $\sigma$ to be the identity on $F_{\mathfrak p}$.

Conversely suppose that $\sigma$ extends to an automorphism of $K_{P_i}$ fixing $F_{\mathfrak p}$. Let
us use the name $\sigma$ also for the extension. On $K_{P_i}$, the functions $x \mapsto |x|_{P_i}$ and
$x \mapsto |\sigma(x)|_{P_i}$ are absolute values that extend $|\cdot|_{\mathfrak p}$ on $F_{\mathfrak p}$. Theorem 6.33 shows
that they must be equal, and therefore $\sigma$ is an isometry.


Proposition 6.40. In the above setting with $K/F$ Galois, let $P$ be one of
the ideals $P_1, \dots, P_g$, let $G = \mathrm{Gal}(K/F)$ be the Galois group, and let $G_P$
be the decomposition group at $P$. Then $K_P$ is a Galois extension of $F_{\mathfrak p}$, the
members of $G_P$ extend to be isometries of $K_P$ that fix $F_{\mathfrak p}$, and the resulting map
$\varphi : G_P \to \mathrm{Gal}(K_P/F_{\mathfrak p})$ exhibits $G_P$ as isomorphic to $\mathrm{Gal}(K_P/F_{\mathfrak p})$.


PROOF. Since $K_P$ is generated by $F_{\mathfrak p}$ and $K$, it is obtained by adjoining to
$F_{\mathfrak p}$ the same roots of the same polynomials over $F$ that are used to generate $K$.
Therefore $K_P/F_{\mathfrak p}$ is a Galois extension.

Lemma 6.39 gives us the map $\varphi$ of $G_P$ into $\mathrm{Gal}(K_P/F_{\mathfrak p})$. The map $\varphi$ is a
homomorphism because the extension of each member of $G_P$ is unique. It is
one-one because the inclusion $K \subseteq K_P$ is one-one.

To see that it is onto, let $\sigma$ be in $\mathrm{Gal}(K_P/F_{\mathfrak p})$, and choose an element $\xi \in K$
such that $K = F(\xi)$. If $m(X)$ is the minimal polynomial of $\xi$ over $F$, then $\sigma(\xi)$
is an element of $K_P$ with $m(\sigma(\xi)) = 0$. Consequently $\sigma(\xi)$ is a root of $m(X)$.
Since $K/F$ is Galois and $m(X)$ has one root in $K$, all its roots are in $K$. Thus
$\sigma(\xi)$ is in $K$. The most general member of $K$ is of the form $q(\xi)$, where $q(X)$
is a polynomial of degree less than $\deg m(X)$, and $q(\sigma(\xi))$ has to be in $K$ also.
Thus $\sigma$ is an automorphism of $K$ fixing $F$. As such, $\sigma$ must send $T$ into itself
and must send $P$ into some ideal $P_j$ of $T$ containing $\mathfrak pT$. Meanwhile, Lemma
6.39 shows that $\sigma$ is an isometry of $K$ relative to $|\cdot|_P$. Thus $\sigma$ must send $P$ into
itself. In other words, the restriction of $\sigma$ to $K$ is in the decomposition group $G_P$.


We know from Theorem 9.64 of Basic Algebra that every member $\sigma$ of the
decomposition group $G_P$ yields a member $\bar\sigma$ of $\mathrm{Gal}((T/P)/(R/\mathfrak p))$ and that
the resulting map $\sigma \mapsto \bar\sigma$ is a homomorphism onto. Proposition 6.40 allows
us to reinterpret this homomorphism as carrying the Galois group of $K_P$ onto
the Galois group of $T/P$. The order of $\mathrm{Gal}(K_P/F_{\mathfrak p})$ is $ef$, and the order of
$\mathrm{Gal}((T/P)/(R/\mathfrak p))$ is $f$. Thus the kernel of this homomorphism, which is
called the inertia group of $K_P/F_{\mathfrak p}$, has order $e$. By Galois theory the fixed
field $L$ of the inertia group has $[K_P : L] = e$, $L/F_{\mathfrak p}$ is a Galois extension,
and $\mathrm{Gal}(L/F_{\mathfrak p})$ has order $f$. This construction has been arranged to make
$\mathrm{Gal}(L/F_{\mathfrak p}) \cong \mathrm{Gal}((T/P)/(R/\mathfrak p))$. As the Galois group of a finite extension
of finite fields, the Galois group on the right is cyclic of order $f$. Therefore
$\mathrm{Gal}(L/F_{\mathfrak p})$ is cyclic of order $f$.

Referring back to the statement of Proposition 6.38, we might guess that the
fixed field $L$ of the inertia group is the unique intermediate field such that $K_P/L$
is totally ramified and $L/F_{\mathfrak p}$ is unramified. This guess is completely correct, but
we omit the proof.


8. Different and Discriminant 


Theorem 6.31 is the key to a “local/global” approach to handling certain kinds 
of problems in algebraic number theory and in its analog in algebraic geometry. 
To illustrate the approach and its power, we shall give in this section and in the 
problems at the end of the chapter a full proof for the Dedekind Discriminant 
Theorem (Theorem 5.5), which was left only partially proved in Chapter V. 
That theorem as stated in Chapter V says that the prime numbers p for which 
ramification occurs in passing from Q to a number field K are exactly the primes 
dividing the field discriminant. The result we obtain now²² will in fact generalize
Theorem 5.5 significantly. In giving the details, we leave the proofs of Proposition 
6.38 and Lemmas 6.47 and 6.48 to Problems 15-19 at the end of the chapter. 

In the approach used in Chapter V, we were unable to handle primes that are 
“common index divisors” in the sense of Section V.2. Section V.4 exhibited 
an example of a common index divisor. The difficulty with the approach in 
Chapter V is that localization by itself does not ostensibly separate the primes 
from one another sufficiently for us fully to handle them one at a time. The 
completion step is a tool powerful enough to complete the separation. 

For part of this section, we shall work in the setting of Theorem 6.31, in
which we compare two Dedekind domains whose fields of fractions are related
by a separable field extension. The situation of eventual interest is that the two
Dedekind domains are the rings of algebraic integers within two number fields,
but we shall encounter also $\mathfrak p$-adic versions of this situation. Thus let $R$ be a
Dedekind domain regarded as a subring of its field of fractions $F$, let $K$ be a finite
separable extension of $F$ with $[K : F] = n$, and let $T$ be the integral closure
of $R$ in $K$. In this setting we shall introduce an ideal $D(K/F)$ of $T$ known as
the "relative different" of the two fields, and we shall establish conditions under
which the relative different captures fairly precisely what ramification occurs in
passing from $R$ to $T$. This is the generalized version of the Dedekind Discriminant
Theorem and appears as Theorem 6.45 below.


²²Dedekind's Theorem on Differents, given as Theorem 6.45.

In the special case that $F = \mathbb Q$, we shall see that the field discriminant $D_K$
satisfies $|D_K| = N(D(K/\mathbb Q))$. In words, the field discriminant is the absolute
norm of the relative different $D(K/\mathbb Q)$ except possibly for a sign. Using the
properties of $N(\cdot)$ listed in Proposition 5.4, we can read off the version of
the Dedekind Discriminant Theorem stated in Theorem 5.5 from the results we
establish about the relative different.

We work with fractional ideals in $F$ and in $K$. If $M$ is any nonzero fractional
ideal of $K$, we define its (relative) dual as


$$\widehat M = \{x \in K \mid \mathrm{Tr}_{K/F}(xy) \text{ is in } R \text{ for all } y \in M\}.$$


Lemma 6.41. In the above setting, if $M$ is a nonzero fractional ideal of $K$,
then so is its dual $\widehat M$.


PROOF. Since $T$ has $K$ as its field of fractions, there exists an $F$ vector space
basis $\{t_1, \dots, t_n\}$ of $K$ consisting of members of $T$. If $m_0$ is a nonzero member
of $M$ and $m_j = t_j m_0$, then $\{m_1, \dots, m_n\}$ is an $F$ vector space basis of $K$ lying in
$M$. Form the $R$ submodule $M_1 = \sum_{j=1}^{n} R m_j$ of $M$, and let $\{x_1, \dots, x_n\}$ be the
$F$ vector space basis of $K$ such that $\mathrm{Tr}_{K/F}(x_i m_j) = \delta_{ij}$. Let


$$\widehat{M_1} = \{x \in K \mid \mathrm{Tr}_{K/F}(xm) \text{ is in } R \text{ for all } m \in M_1\}.$$


If we expand a general element $x$ of $K$ as $x = \sum_{i=1}^{n} c_i x_i$, then a necessary
condition for $x$ to be in $\widehat{M_1}$ is that $c_j = \mathrm{Tr}_{K/F}(x m_j)$ be in $R$ for all $j$. On the
other hand, this condition is also sufficient because an element $x$ with all $c_i \in R$
has $\mathrm{Tr}_{K/F}(xm) = \sum_{j=1}^{n} c_j r_j$ if $m = \sum_{j=1}^{n} r_j m_j$. Thus $\widehat{M_1}$ is a finitely generated
$R$ module with $x_1, \dots, x_n$ as generators. Let $S$ be the $T$ submodule of $K$ given by
$S = \sum_{j=1}^{n} T x_j$. This is a finitely generated $T$ submodule of $K$ that contains $\widehat{M_1}$.
The inclusion $M \supseteq M_1$ evidently implies that $\widehat M \subseteq \widehat{M_1}$, and hence $\widehat M \subseteq S$. In
this way, $\widehat M$ is exhibited as a $T$ submodule of the finitely generated $T$ submodule
$S$ of $K$, and $\widehat M$ must itself be finitely generated because $T$ is a Noetherian ring.


Proposition 6.42. In the above setting, the dual $\widehat T$ of $T$ is of the form $\widehat T =
D(K/F)^{-1}$ for an ideal $D(K/F)$ of $T$. This ideal $D(K/F)$ has the property that


$$\widehat M = M^{-1} D(K/F)^{-1}$$


for every nonzero fractional ideal $M$ of $K$.


REMARK. The ideal $D(K/F)$ in $T$ is called the relative different of $K$ with
respect to $F$.

PROOF. From the definition, $\widehat T$ consists of all $x$ in $K$ for which $\mathrm{Tr}_{K/F}(xT)$ is
in $R$; any member $x$ of $T$ has this property, and thus $T \subseteq \widehat T$. Lemma 6.41 shows
that $\widehat T$ is a fractional ideal of $K$. Since $\widehat T$ contains $T$, it is the inverse of an ideal
of $T$. This ideal we define as $D(K/F)$.

Let $M$ be an arbitrary nonzero fractional ideal of $K$. Since $M^{-1}M = T$, we
have $\mathrm{Tr}_{K/F}(M^{-1}D(K/F)^{-1} \cdot M) = \mathrm{Tr}_{K/F}(D(K/F)^{-1}) = \mathrm{Tr}_{K/F}(\widehat T\, T) \subseteq R$,
and it follows that $M^{-1}D(K/F)^{-1} \subseteq \widehat M$. For the reverse inclusion, let $x$ be in
$\widehat M$. Then $\mathrm{Tr}_{K/F}(xM \cdot t) \subseteq \mathrm{Tr}_{K/F}(xM) \subseteq R$ for all $t \in T$, and hence $xM \subseteq
\widehat T = D(K/F)^{-1}$. This being true for all $x \in \widehat M$, we obtain $\widehat M M \subseteq D(K/F)^{-1}$.
Therefore $\widehat M \subseteq M^{-1}D(K/F)^{-1}$.
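To make the dual and the different concrete, here is a small sketch for $K = \mathbb Q(i)$ and $F = \mathbb Q$. It assumes the standard fact that $T = \mathbb Z[i]$ with $\mathbb Z$ basis $(1, i)$, stores $a + b\sqrt d$ as the pair `(a, b)`, and uses hypothetical helper names. The dual basis with respect to the trace form is obtained by inverting the Gram matrix $[\mathrm{Tr}_{K/\mathbb Q}(x_i x_j)]$.

```python
from fractions import Fraction

def mul(u, v, d):
    """Product of u = a + b*sqrt(d) and v = c + e*sqrt(d), stored as pairs."""
    (a, b), (c, e) = u, v
    return (a * c + d * b * e, a * e + b * c)

def trace(u):
    """Trace from Q(sqrt(d)) to Q of a + b*sqrt(d), namely 2a."""
    return 2 * u[0]

def gram(basis, d):
    """Matrix of the trace form Tr(x_i x_j) on a two-element basis."""
    return [[trace(mul(x, y, d)) for y in basis] for x in basis]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def dual_basis(basis, d):
    """Basis (y_1, y_2) of K with Tr(x_i y_j) = delta_ij."""
    g = gram(basis, d)
    D = Fraction(det2(g))
    inv = [[g[1][1] / D, -g[0][1] / D], [-g[1][0] / D, g[0][0] / D]]
    # y_j = sum_k inv[k][j] * x_k, computed coordinate by coordinate
    return [tuple(sum(inv[k][j] * basis[k][c] for k in range(2)) for c in range(2))
            for j in range(2)]

basis = [(1, 0), (0, 1)]          # the Z basis (1, i) of Z[i], with d = -1
G = gram(basis, -1)               # [[2, 0], [0, -2]]
dual = dual_basis(basis, -1)      # (1/2, -i/2)
```

Under these assumptions the dual basis is $(\tfrac12, -\tfrac12 i)$, so that $\widehat T = \tfrac12\mathbb Z[i]$ and $D(K/\mathbb Q) = 2\mathbb Z[i]$. The Gram determinant is $-4$, and its absolute value 4 is both the index of $T$ in $\widehat T$ and the number of elements of $\mathbb Z[i]/2\mathbb Z[i]$.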


Proposition 6.43. In the above setting, if $L$ is a field with $F \subseteq L \subseteq K$, then
$$D(K/F) = D(K/L)\,D(L/F)$$
as an equality of fractional ideals in $K$.


REMARKS. Let $U$ be the integral closure of $R$ in $L$. In the displayed line of the
proposition, $D(L/F)$ is an ideal in $U$, and the right side amounts to the product
in $T$ given by $D(K/L) \cdot D(L/F)T$.

PROOF. We use the fact that traces can be computed in stages. An element
$x$ of $K$ is in $D(K/F)^{-1}$ if and only if $\mathrm{Tr}_{K/F}(xT) \subseteq R$, if and only if
$\mathrm{Tr}_{L/F}(\mathrm{Tr}_{K/L}(xT)) \subseteq R$, if and only if $\mathrm{Tr}_{K/L}(xT) \subseteq \widehat U = D(L/F)^{-1}$, if and
only if $\mathrm{Tr}_{K/L}(xT D(L/F)) \subseteq U$, if and only if $xT D(L/F) \subseteq D(K/L)^{-1}$. Thus
$D(K/F)^{-1} D(L/F) = D(K/L)^{-1}$, and the result follows.


The main result of this section, from which the Dedekind Discriminant The- 
orem will be derived as Corollary 6.49, is Theorem 6.45 below, Dedekind’s 
Theorem on Differents. The proof requires some preparation. Two results will be 
used to reduce Theorem 6.45 to a statement about complete fields, for which only a 
single prime ideal is involved, both for R and for T. The first of these is Theorem 
6.31, or more particularly its consequence for traces given in Corollary 6.37c. 
The other is the following strengthening of the Weak Approximation Theorem in 
the presence of additional hypotheses. The reduction step to a statement about 
complete fields then appears as Proposition 6.46.


Theorem 6.44 (Strong Approximation Theorem). Let $F$ be a number field,
let $R$ be its ring of algebraic integers, let $P_1, \dots, P_r$ be distinct nonzero prime
ideals in $R$, and let $v_{P_j}$ for each $j$ be the valuation of $F$ and of its completion that
corresponds to $P_j$. If $l_1, \dots, l_r$ are integers and if $x_j$ for $1 \le j \le r$ is a member
of the completed field $F_{P_j}$, then there exists $y$ in $F$ such that


$$v_{P_j}(y - x_j) \ge l_j \qquad \text{for } 1 \le j \le r$$


and such that $v_Q(y) \ge 0$ for all other nonzero prime ideals $Q$ of $R$.

REMARKS.

(1) It will be helpful to have a name for the property in the conclusion of
Theorem 6.44. Thus let $T$ be a Dedekind domain regarded as a subring of its
field of fractions $K$. We say that $T$ has the strong approximation property if
whenever distinct nonzero prime ideals $P_1, \dots, P_r$ of $T$ are given, along with
integers $l_1, \dots, l_r$ and members $x_j$ of the completed field $K_{P_j}$ for $1 \le j \le r$,
then there exists $y$ in $K$ such that $v_{P_j}(y - x_j) \ge l_j$ for $1 \le j \le r$ and such that
$v_Q(y) \ge 0$ for all other nonzero prime ideals $Q$ of $T$. The content of Theorem
6.44 is that the ring of algebraic integers in any number field has the strong
approximation property.

(2) More generally any principal ideal domain has the strong approximation
property. In fact, if $R$ is a principal ideal domain with field of fractions $F$, if $K$
is a finite extension of $F$, and if $T$ is the integral closure of $R$ in $K$, then $T$ is
a Dedekind domain (according to the remarks with Proposition 6.7), and $T$ has
the strong approximation property. The proof is an easy adaptation of the proof
below, with the principal ideal domain substituting for the ring $\mathbb Z$ of integers. As
a consequence if $k$ is a field and if $T$ is the integral closure of $k[X]$ in a finite
extension of $k(X)$, then $T$ has the strong approximation property.

(3) Any Dedekind domain with only finitely many prime ideals has the strong 
approximation property as an immediate consequence of the Weak Approximation 
Theorem (Theorem 6.23). One does not need to make use of the fact that such a 
domain is always a principal ideal domain. 

(4) For a number field the conclusion of the theorem as stated imposes a 
limitation on all the nonarchimedean absolute values. The conclusion cannot be 
strengthened to impose a limitation on all equivalence classes of absolute values, 
since the Artin product formula (Theorem 6.51 below) imposes a constraint on 
the set of all of them. 


PROOF.²³ We may assume that each $l_j$ satisfies $l_j > 0$. Recall that for each
prime number $p$, there are only finitely many prime ideals $P$ in $R$ with $P \cap \mathbb Z =
p\mathbb Z$. Possibly by moving some of the conditions $v_Q(y) \ge 0$ into the displayed
hypothesis concerning the $P_j$'s, we may assume that there is some finite set
$\{p_1, \dots, p_q\}$ of primes such that $\{P_1, \dots, P_r\}$ consists exactly of all prime ideals
$P$ such that $P \cap \mathbb Z = p_i\mathbb Z$ for some $i$ with $1 \le i \le q$.

Application of the Weak Approximation Theorem (Theorem 6.23) to the absolute
values corresponding to $P_1, \dots, P_r$ produces an element $z \in F$ with

$$v_{P_j}(z - x_j) \ge l_j \qquad \text{for } 1 \le j \le r.$$


²³This proof is from Hasse's Number Theory, pp. 379-380. The argument for $R = \mathbb Z$ and all
$l_j = 0$ is the key. After an application of the Weak Approximation Theorem, what has to be shown
is that if $P_j = p_j\mathbb Z$ for $1 \le j \le r$ and if a rational $ab^{-1}$ is given, then there exists a rational $mn^{-1}$
with $n$ prime to $p_1, \dots, p_r$ such that the denominator of $ab^{-1} - mn^{-1}$ is divisible only by the primes
$p_1, \dots, p_r$. Another proof of Theorem 6.44, which appears in other books, uses the theory of adeles
and ideles to be developed in the next two sections, and again the argument for $\mathbb Z$ is the key.


Form the fractional ideal $zR$ in $F$, and let its unique factorization be $zR =
P_1^{a_1} \cdots P_r^{a_r} Q_1 Q_2^{-1}$, where the $a_j$ are in $\mathbb Z$ and where $Q_1$ and $Q_2$ are ideals of $R$
whose prime factorizations involve no $P_j$. Let us see that $Q_2$ divides a nonzero
principal ideal $(N)$ of $R$ whose generator $N$ is in $\mathbb Z$ and that $N$ can be chosen to
be relatively prime to $p_1, \dots, p_q$. In fact, it is enough to treat each prime factor
of $Q_2$ separately and multiply the results. For a prime factor $P$, we know that
$P \cap \mathbb Z = p\mathbb Z$ for some prime $p$ in $\mathbb Z$, and we know that $pR$ is the product of $P$ and
another ideal of $R$. This prime $p$ is nonassociate to each of $p_1, \dots, p_q$ because
the only prime ideals whose intersection with $\mathbb Z$ is some $p_i\mathbb Z$ are $P_1, \dots, P_r$ and
because no such prime ideal divides $Q_2$. Therefore the prime factorization of
$(N)$ contains no factor $P_1, \dots, P_r$.

Let $b$ be a positive integer to be specified, and choose an integer $l$ such that
$lN \equiv 1 \bmod p_i^b$ for $1 \le i \le q$. If $p_iR$ factors as $\prod_k P_{j_k}^{e_k}$ with each $P_{j_k}$ in
$\{P_1, \dots, P_r\}$, then $l$ has the property that $lN - 1$ lies in $\bigl(\prod_k P_{j_k}^{e_k}\bigr)^b$, hence in
each $P_{j_k}^b$. Consequently $lN - 1$ lies in $P_j^b$ for $1 \le j \le r$.

We show that if $b$ is sufficiently large, then the element $y = lNz$ is the
element we seek. First consider nonzero prime ideals $Q$ not in $\{P_1, \dots, P_r\}$. Our
factorizations of $zR$ and $(N)$ show that $yR = lQ_1Q_3P_1^{a_1} \cdots P_r^{a_r}$, where $Q_3$ is the
ideal of $R$ with $(N) = Q_2Q_3$. The power of
$Q$ on the right side is $\ge 0$ because $Q_1$ and $Q_3$ are ideals of $R$, and thus


$$v_Q(y) \ge 0. \tag{$*$}$$


Now write $y - x_j = (lN - 1)z + (z - x_j)$, and apply the valuation $v_{P_j}$. Then
we have
$$v_{P_j}(y - x_j) \ge \min\bigl(v_{P_j}((lN - 1)z),\; v_{P_j}(z - x_j)\bigr),$$


and it follows from $v_{P_j}(z - x_j) \ge l_j$ that
$$v_{P_j}(y - x_j) \ge l_j \tag{$**$}$$


if we can arrange that
$$v_{P_j}((lN - 1)z) \ge l_j. \tag{$\dagger$}$$


Since $lN - 1$ lies in $P_j^b$ and since $v_{P_j}(z) = a_j$, a sufficient condition for (†) is that
$b + a_j \ge l_j$. As $j$ varies, we impose only finitely many conditions on $b$ to get (†)
to hold for all $j$, and then the result is that (**) holds for all $j$. In combination
with (*), this inequality shows that $y$ has the required properties.
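The key step of the proof, multiplying $z$ by $lN$ with $lN \equiv 1 \bmod p_i^b$, can be tried out in the simplest case $R = \mathbb Z$. The following is a minimal sketch under that assumption; the function names `vp` and `clear_outside_denominator` are hypothetical, and `pow(N, -1, M)` requires Python 3.8 or later.

```python
from fractions import Fraction

def vp(x, p):
    """p-adic valuation of a nonzero rational number x."""
    x = Fraction(x)
    n, d, v = x.numerator, x.denominator, 0
    while n % p == 0:
        n //= p
        v += 1
    while d % p == 0:
        d //= p
        v -= 1
    return v

def clear_outside_denominator(z, primes, b):
    """Return y = l*N*z, where N is the part of the denominator of z that is
    prime to the given primes and l*N is congruent to 1 modulo p^b for each p."""
    z = Fraction(z)
    N = z.denominator
    for p in primes:
        while N % p == 0:
            N //= p
    M = 1
    for p in primes:
        M *= p ** b
    l = pow(N, -1, M)        # inverse of N modulo M (Python 3.8+)
    return l * N * z

z = Fraction(7, 20)          # denominator involves the outside prime 5
primes, b = [2, 3], 4
y = clear_outside_denominator(z, primes, b)
```

With these choices $N = 5$ and $l = 1037$, so that $y = 7259/4$. The denominator of $y$ involves only the prime 2 from the given set, and $y - z = (lN - 1)z$ has valuation at least $b + v_p(z)$ at $p = 2$ and $p = 3$, in line with the sufficient condition $b + a_j \ge l_j$ of the proof.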


The preparation is all in place to prove Dedekind's Theorem on Differents,
from which we shall easily derive the Dedekind Discriminant Theorem. The
statement is as follows.

Theorem 6.45 (Dedekind's Theorem on Differents). Let $R$ be a Dedekind
domain regarded as a subring of its field of fractions $F$, let $K$ be a finite separable
extension of $F$ with $[K : F] = n$, and let $T$ be the integral closure of $R$ in
$K$. Suppose that $T$ has the strong approximation property. Let $\mathfrak p$ be a nonzero
prime ideal in $R$, let $p > 0$ be the characteristic of the residue class field $R/\mathfrak p$,
let $\mathfrak pT = P_1^{e_1} \cdots P_g^{e_g}$ be the factorization of $\mathfrak pT$ as the product of positive
powers of distinct prime ideals in $T$, and let the relative different of $K/F$ split as
$D(K/F) = P_1^{e_1'} \cdots P_g^{e_g'} Q$ for an ideal $Q$ relatively prime to all $P_j$. Then for each
$j$ with $1 \le j \le g$, $e_j'$ is given by


$$e_j' = \begin{cases} e_j - 1 & \text{if } p \text{ does not divide } e_j, \\ e_j' \text{ with } e_j' \ge e_j & \text{if } p \text{ divides } e_j. \end{cases}$$


Consequently $D(K/F)$ has all $e_j' = 0$ if and only if $e_j = 1$ for all $j$.


The idea is to reduce Theorem 6.45 to the case of complete fields. In the
notation in the statement of the theorem, the prime ideals $P_1, \dots, P_g$ are exactly
the prime ideals of $T$ that divide $\mathfrak pT$, and it is customary to write $P_j \mid \mathfrak p$ for these
prime ideals of $T$ and only these. If $M$ is a nonzero fractional ideal of $K$ and if
$M = P_1^{k_1} \cdots P_g^{k_g} Q$ with $Q$ a fractional ideal whose factorization involves no $P_j$,
we define the $\mathfrak p$ component of $M$ to be


$$M_{\mathfrak p} = P_1^{k_1} \cdots P_g^{k_g}.$$
The understanding in the special case that all $k_j$ are 0 is that $M_{\mathfrak p}$ is taken to be $T$. In
all cases, $M$ is then the product over all $\mathfrak p$ of its $\mathfrak p$ component, since the complete
factorization of $M$ has nonzero exponents for only finitely many nonzero prime
ideals of $T$. For the two examples that appear in the statement of Theorem 6.45,


$$(\mathfrak pT)_{\mathfrak p} = \prod_{P_i \mid \mathfrak p} P_i^{e_i} \qquad \text{and} \qquad D(K/F)_{\mathfrak p} = \prod_{P_i \mid \mathfrak p} P_i^{e_i'}.$$
The reduction of Theorem 6.45 to the case of complete fields results from the
following proposition, which combines Theorem 6.31 and the strong approximation
property (Theorem 6.44 in the case of number fields).


Proposition 6.46. Let $R$ be a Dedekind domain regarded as a subring of its
field of fractions $F$, let $K$ be a finite separable extension of $F$ with $[K : F] = n$,
and let $T$ be the integral closure of $R$ in $K$. Suppose that $T$ has the strong
approximation property. Then the different $D(K/F)$ has the property that


$$D(K/F) = \prod_{\mathfrak p} \prod_{P \mid \mathfrak p} D(K_P/F_{\mathfrak p}),$$


the outer product being taken over all nonzero prime ideals $\mathfrak p$ of $R$ and the inner
product being taken over all prime ideals $P$ of $T$ containing $\mathfrak pT$. Here the fields
$K_P$ and $F_{\mathfrak p}$ are the completions of $K$ and $F$ corresponding to $P$ and $\mathfrak p$, respectively.


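PROOF. We actually will show equality of the inverses of the two sides of the
displayed formula. By the first conclusion of Proposition 6.42, we are to show
that a member $x$ of $K$ has


$$\mathrm{Tr}_{K/F}(xT) \subseteq R \quad \text{if and only if} \quad \mathrm{Tr}_{K_{P_i}/F_{\mathfrak p}}((xT)_i) \subseteq R_{\mathfrak p} \tag{$*$}$$


for all $\mathfrak p$ and all $P_i$ with $P_i \mid \mathfrak p$. Here $(\,\cdot\,)_i$ refers to the embedding $K \to K_{P_i}$ in
Theorem 6.31 given by $\xi \mapsto \xi_i = \eta_i(\xi \otimes 1)$, where $\eta_i$ is the $i$th projection. To
prove (*), we use the formula of Corollary 6.37c, namely


$$\mathrm{Tr}_{K/F}(\xi) = \sum_{i=1}^{g} \mathrm{Tr}_{K_{P_i}/F_{\mathfrak p}}(\xi_i) \qquad \text{for all } \xi \in K. \tag{$**$}$$


This formula is valid for every $\mathfrak p$.

First suppose that $\mathrm{Tr}_{K_{P_i}/F_{\mathfrak p}}((xT)_i) \subseteq R_{\mathfrak p}$ for all $\mathfrak p$ and all $P_i$ with $P_i \mid \mathfrak p$. Fix $\mathfrak p$,
and put $\xi = xt$ with $t \in T$. Summing the traces over $P_i$ with $P_i \mid \mathfrak p$ and applying
(**), we see that the valuation with respect to $\mathfrak p$ of the member $\mathrm{Tr}_{K/F}(\xi)$ of $F$
is $\ge 0$. That is, the factor $\mathfrak p^k$ that appears in the factorization of the principal
fractional ideal $\mathrm{Tr}_{K/F}(\xi)R$ of $F$ has $k \ge 0$. This being true for all $\mathfrak p$ means that
$\mathrm{Tr}_{K/F}(\xi)R$ is an ordinary ideal. Hence $\mathrm{Tr}_{K/F}(\xi)$ is in $R$.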

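In the reverse direction, suppose that $\mathrm{Tr}_{K/F}(xT) \subseteq R$. For each nonzero prime
ideal $P$ in $T$, let $v_P$ be the corresponding valuation. Fix $\mathfrak p$. Let $\{P_1, \dots, P_g\}$ be the
set of $P$'s with $P \mid \mathfrak p$. Now fix $i$. By the assumed strong approximation property
of $T$, there exists an element $y$ in $K$ with


$$\begin{aligned}
v_{P_i}(y - x) &\ge \max(v_{P_i}(x), 0), \\
v_{P_j}(y) &\ge \max(v_{P_j}(x), 0) \quad \text{for } j \ne i, \\
v_Q(y) &\ge 0 \quad \text{for all prime ideals } Q \notin \{P_1, \dots, P_g\}.
\end{aligned}$$


Let us see that $v_{P_j}(yx^{-1}) \ge 0$ for all $j$. For $j \ne i$, this is immediate because
$v_{P_j}(y) \ge v_{P_j}(x)$. For $j = i$, we compute that


$$v_{P_i}(yx^{-1} - 1) = v_{P_i}(y - x) - v_{P_i}(x) \ge \max(v_{P_i}(x), 0) - v_{P_i}(x) = \max(0, -v_{P_i}(x)) \ge 0,$$


and then we see that $v_{P_i}(yx^{-1}) \ge \min\bigl(v_{P_i}(yx^{-1} - 1),\, v_{P_i}(1)\bigr) \ge 0$.

With $y$ now fixed, we make use of the strong approximation property of $T$ a
second time, obtaining an element $z$ in $K$ with


$$\begin{aligned}
v_{P_j}(z - yx^{-1}) &\ge \max(v_{P_j}(x^{-1}), 0) \quad \text{for } 1 \le j \le g, \\
v_Q(z) &\ge 0 \quad \text{for all prime ideals } Q \notin \{P_1, \dots, P_g\}.
\end{aligned}$$


Since $v_{P_j}(yx^{-1}) \ge 0$ and $v_{P_j}(z - yx^{-1}) \ge 0$ for all $j$, we find that $v_{P_j}(z) \ge 0$
for all $j$. From $v_Q(z) \ge 0$ for all other $Q$, we conclude that $z$ is in $T$. Since
$\mathrm{Tr}_{K/F}(xT) \subseteq R$, $\mathrm{Tr}_{K/F}(xz)$ lies in $R$. The trace formula (**) therefore shows
that
$$\sum_{j=1}^{g} \mathrm{Tr}_{K_{P_j}/F_{\mathfrak p}}(x_j z_j) \quad \text{lies in } R_{\mathfrak p}. \tag{$\dagger$}$$


Meanwhile, we have
$$\mathrm{Tr}_{K_{P_j}/F_{\mathfrak p}}(x_j z_j) = \mathrm{Tr}_{K_{P_j}/F_{\mathfrak p}}\bigl(x_j(z_j - y_j x_j^{-1})\bigr) + \mathrm{Tr}_{K_{P_j}/F_{\mathfrak p}}(y_j) \tag{$\dagger\dagger$}$$


for $1 \le j \le g$. For all $j$, the first term on the right side of (††) lies in $R_{\mathfrak p}$ because
the definition of $z$ makes $v_{P_j}(x(z - yx^{-1})) \ge 0$. For $j \ne i$, the second term
on the right side lies in $R_{\mathfrak p}$ because of the definition of $y$. Thus (††) shows that
$\mathrm{Tr}_{K_{P_j}/F_{\mathfrak p}}(x_j z_j)$ lies in $R_{\mathfrak p}$ for $j \ne i$. Comparing this conclusion with (†), we see
that $\mathrm{Tr}_{K_{P_i}/F_{\mathfrak p}}(x_i z_i)$ lies in $R_{\mathfrak p}$. Resubstituting into (††), we find that


$$\mathrm{Tr}_{K_{P_i}/F_{\mathfrak p}}(y_i) \quad \text{lies in } R_{\mathfrak p}. \tag{$\ddagger$}$$


Finally the definition of $y$ shows that $v_{P_i}(y - x) \ge 0$. Hence $\mathrm{Tr}_{K_{P_i}/F_{\mathfrak p}}(y_i - x_i)$
is in $R_{\mathfrak p}$. Combining this fact with (‡), we conclude that $\mathrm{Tr}_{K_{P_i}/F_{\mathfrak p}}(x_i)$ is in $R_{\mathfrak p}$.
Since $i$ is arbitrary, $\mathrm{Tr}_{K_{P_j}/F_{\mathfrak p}}(x_j)$ is in $R_{\mathfrak p}$ for $1 \le j \le g$.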


With the proof of Theorem 6.45 reduced to the case of complete valued fields 
by Proposition 6.46, we need to make use of Lemmas 6.47 and 6.48 below, whose 
proofs are carried out in Problems 17—19 at the end of the chapter. 


Lemma 6.47. Let $F$ be a complete valued field with respect to a discrete
nonarchimedean valuation, let $R$ be its valuation ring, let $\mathfrak p$ be its valuation ideal,
let $K$ be a finite separable extension of $F$ with $[K : F] = n$, let $T$ be the integral
closure of $R$ in $K$, and let $P$ be the unique nonzero prime ideal in $T$. Suppose
that $K/F$ is totally ramified with $\mathfrak pT = P^e$ for an integer $e > 1$, and suppose that
the isomorphic residue class fields $R/\mathfrak p$ and $T/P$ are finite fields of characteristic
$p$. Then the different $D(K/F)$ is given by $D(K/F) = P^{e'}$, where


$$e' = \begin{cases} e - 1 & \text{if } p \text{ does not divide } e, \\ e' \text{ with } e' \ge e & \text{if } p \text{ divides } e. \end{cases}$$




Lemma 6.48. Let $F$ be a complete valued field with respect to a discrete
nonarchimedean valuation, let $R$ be its valuation ring, let $\mathfrak p$ be its valuation ideal,
let $K$ be a finite separable extension of $F$ with $[K : F] = n$, let $T$ be the integral
closure of $R$ in $K$, and let $P$ be the unique nonzero prime ideal in $T$. Suppose that
$K/F$ is unramified, i.e., has $\mathfrak pT = P$, and suppose that the residue class fields
$R/\mathfrak p$ and $T/P$ are finite fields of characteristic $p$. Then the different $D(K/F)$
equals $T$.


PROOF OF THEOREM 6.45. Proposition 6.46 shows that


$$D(K/F)_{\mathfrak p} = \prod_{P \mid \mathfrak p} D(K_P/F_{\mathfrak p}). \tag{$*$}$$


Thus consider an extension $K_P/F_{\mathfrak p}$ of complete valued fields. Let $L$ be the inertia
subfield of $K_P/F_{\mathfrak p}$ as given by Proposition 6.38. The intermediate field $L$ has the
properties that $K_P/L$ is totally ramified and that $L/F_{\mathfrak p}$ is unramified.

Let $U$ be the integral closure of $R$ in $L$, and let $\wp$ be the unique nonzero
prime ideal in $U$. The properties of $L$ make $\wp T = P^e$ for a suitable integer
$e = e(P \mid \wp)$, $T/P = U/\wp$, and $\mathfrak pU = \wp$. Lemmas 6.47 and 6.48 tell us that
$D(L/F_{\mathfrak p}) = U$ and that $D(K_P/L) = P^{e'}$, where


$$e' = \begin{cases} e - 1 & \text{if } p \text{ does not divide } e, \\ e' \text{ with } e' \ge e & \text{if } p \text{ divides } e. \end{cases} \tag{$**$}$$


Problem 33 at the end of Chapter IX of Basic Algebra shows that ramification
indices multiply for successive extensions. Thus $e(P \mid \mathfrak p) = e(P \mid \wp)\,e(\wp \mid \mathfrak p) =
e \cdot 1 = e$. Proposition 6.43 shows that differents multiply in corresponding fashion.
Therefore $D(K_P/F_{\mathfrak p}) = D(K_P/L)\,D(L/F_{\mathfrak p}) = P^{e'}U = P^{e'}$. Substituting into
(*), we obtain


$$D(K/F)_{\mathfrak p} = \prod_{P \mid \mathfrak p} D(K_P/F_{\mathfrak p}) = \prod_{P \mid \mathfrak p} P^{e'(P \mid \mathfrak p)},$$


where $e'(P \mid \mathfrak p)$ is the integer $e'$ of (**) when $e = e(P \mid \mathfrak p)$. This proves Theorem
6.45 for the $\mathfrak p$ component of $D(K/F)$. Since $\mathfrak p$ is arbitrary and only finitely
many components can be unequal to $T$, the theorem follows.


Corollary 6.49 (= THEOREM 5.5, Dedekind Discriminant Theorem). Let $K$
be a number field, let $T$ be its ring of algebraic integers, let $p$ be a prime number,
and let $(p)T = P_1^{e_1} \cdots P_g^{e_g}$ be the factorization of $(p)T$ as the product of powers
of distinct prime ideals in $T$. Then $e_j$ is greater than 1 for some $j$ if and only if
$p$ divides the field discriminant $D_K$.

PROOF. Let us observe first that the discriminant $D_K$ is given up to sign by the
index $|\widehat T/T|$. In fact, $T$ is a torsion-free finitely generated abelian group and hence
is free abelian of rank $n = [K : \mathbb Q]$, say with an ordered $\mathbb Z$ basis $\Gamma = (x_1, \dots, x_n)$.
Since the $\mathbb Q$ bilinear form $(x, y) \mapsto \mathrm{Tr}_{K/\mathbb Q}(xy)$ is nondegenerate on $K$, there exists
an ordered basis $\Delta = (y_1, \dots, y_n)$ of $K$ with $\mathrm{Tr}_{K/\mathbb Q}(x_i y_j) = \delta_{ij}$. Let us write
$x_j = \sum_i a_{ij} y_i$ with all $a_{ij}$ in $\mathbb Q$. According to Proposition 5.1, $D_K$ equals the
discriminant $D(\Gamma)$ of $\Gamma$, defined in Section V.2 by $D(\Gamma) = \det[\mathrm{Tr}_{K/\mathbb Q}(x_i x_j)]_{ij}$.
Substituting $x_j = \sum_k a_{kj} y_k$, we obtain


$$D(\Gamma) = \det\Bigl[\sum_k a_{kj}\,\mathrm{Tr}_{K/\mathbb Q}(x_i y_k)\Bigr]_{ij} = \det\Bigl[\sum_k a_{kj}\,\delta_{ik}\Bigr]_{ij} = \det[a_{ij}]_{ij}.$$


Since $\Delta$ is a $\mathbb Z$ basis of $\widehat T$ and the matrix $[a_{ij}]$ expresses the $\mathbb Z$ basis $\Gamma$ of $T$ in
terms of it, $|\det[a_{ij}]|$ is the index of $T$ in $\widehat T$.
Thus $|D_K| = |\widehat T/T| = |D(K/\mathbb Q)^{-1}/T|$, as asserted.
In a moment we shall show that


$$|D(K/\mathbb Q)^{-1}/T| = |T/D(K/\mathbb Q)|, \tag{$*$}$$


from which we conclude that $|D_K| = N(D(K/\mathbb Q))$. Assuming (*), we continue.


Unique factorization of ideals allows us to write $D(K/\mathbb Q) = P_1^{e_1'} \cdots P_g^{e_g'} Q$, where
$Q$ is an ideal relatively prime to $(p)$. Combining the equality $|D_K| = N(D(K/\mathbb Q))$
with Proposition 5.4 shows that


$$|D_K| = N(D(K/\mathbb Q)) = N(Q) \prod_{j=1}^{g} N(P_j^{e_j'}) = N(Q) \prod_{j=1}^{g} p^{e_j' f_j},$$


where $N(Q)$ is an integer not divisible by $p$ and where $f_j = \dim_{\mathbb F_p}(T/P_j)$ for
$1 \le j \le g$. Consequently $D_K$ is prime to $p$ if and only if $e_j' = 0$ for all $j$. If we
take into account that $T$ has the strong approximation property as a consequence
of Theorem 6.44, then application of Theorem 6.45 completes the proof of the
present corollary except for the verification of (*).

Thus we are left with proving that $|D(K/\mathbb{Q})^{-1}/T| = |T/D(K/\mathbb{Q})|$. More
generally we shall show that

$$|I^{-1}/T| = |T/I| \qquad (**)$$

for every nonzero ideal I in T. In turn, we shall deduce (**) after showing that

$$|M/PM| = N(P) \qquad (\dagger)$$

whenever M is a nonzero fractional ideal in K and P is a nonzero prime ideal
in T. We do so by showing that M/PM is a vector space over the field T/P of
dimension 1. It is evident that T carries M to itself and PM to itself, and that
P carries M to PM. Thus the action of T on M/PM descends to an action of
T/P on M/PM. The vector space M/PM is not 0 because $M \ne PM$ by unique
factorization of fractional ideals. To see that M/PM has dimension at most 1,
fix an element x of M that does not lie in PM. Then xT + PM is a fractional
ideal of K that is contained in M + PM = M and contains PM and a member
of M that is not in PM. Hence it equals M. Accordingly, if $y \in M$ is given, we
can choose $t \in T$ such that xt - y is in PM. Then (t + P)(x + PM) = y + PM,
and T/P carries x + PM onto M/PM. So M/PM is 1-dimensional over T/P,
and (†) follows.

Returning to (**), let $I = Q_1 \cdots Q_k$ express I as the product of nonzero prime
ideals. Iterated application of (†) and the First Isomorphism Theorem gives

$$|I^{-1}/T| = |I^{-1}/Q_1 \cdots Q_k I^{-1}| = |I^{-1}/Q_1 \cdots Q_{k-1} I^{-1}|\,N(Q_k)$$
$$= |I^{-1}/Q_1 \cdots Q_{k-2} I^{-1}|\,N(Q_k)N(Q_{k-1})$$
$$= \cdots = \prod_{l=1}^{k} N(Q_l) = N(I).$$

Since N(I) = |T/I|, this proves (**) and therefore also (*).
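As a small numerical illustration of the corollary (a sketch, not part of the proof), one can take K = Q(sqrt(d)) with d squarefree. The sketch assumes two standard facts that are not established in this section: the ring of algebraic integers is generated by a root of x^2 - x + (1 - d)/4 when d is congruent to 1 modulo 4 and by a root of x^2 - d otherwise, and a prime p ramifies exactly when that polynomial is the square of a linear polynomial modulo p. All function names are hypothetical.

```python
def field_discriminant(d):
    """Discriminant of Q(sqrt(d)) for a squarefree integer d != 1."""
    return d if d % 4 == 1 else 4 * d


def integral_generator_poly(d):
    """Coefficients (b, c) of x^2 + b*x + c, assumed to generate the ring of integers."""
    return (-1, (1 - d) // 4) if d % 4 == 1 else (0, -d)


def is_ramified(d, p):
    """True if x^2 + b*x + c is congruent to (x - a)^2 modulo p for some a."""
    b, c = integral_generator_poly(d)
    return any((b + 2 * a) % p == 0 and (c - a * a) % p == 0 for a in range(p))


for d in (-5, -3, -1, 2, 3, 5, 6, 13):
    ramified = [p for p in (2, 3, 5, 7, 11, 13) if is_ramified(d, p)]
    print(d, field_discriminant(d), ramified)
```

In each printed row the ramified primes should be exactly the listed primes that divide the discriminant, as the corollary predicts.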


One more point needs explanation. The discussion in Section IX.17 of Basic
Algebra concerned a monic irreducible polynomial F(X) in Z[X] and its reduction
$\overline{F}(X)$ modulo p, and the interest was in the Galois group G of the splitting
field K' of F(X) over Q. Theorem 9.64 of that book dealt with the natural
homomorphism from a decomposition subgroup $G_P$ of G onto the Galois group
$\overline{G}$ of the splitting field over $\mathbb{F}_p$ of $\overline{F}(X)$, and it was asserted without proof that
this homomorphism is one-one if p does not divide the discriminant of F(X).
The order of the kernel of the homomorphism was identified as the common
ramification index of the prime ideals P' containing (p)R', R' being the ring
of algebraic integers in K'. Let K = Q[X]/(F(X)). Except in the quadratic
case, the field K typically has much lower dimension over Q than K' does. The
Dedekind Discriminant Theorem relates $D_K$ to ramification relative to K, as well
as $D_{K'}$ to ramification relative to K'. We know that primes not dividing the
discriminant of F(X) do not divide $D_K$, but we need a proof that primes not
dividing the discriminant of F(X) do not divide $D_{K'}$.

To approach this question, one needs the notion of “relative discriminant” anal- 
ogous to that of “relative different” for an extension K/F of number fields. The 
relative different is defined so as to be an ideal for K, and the relative discriminant 
is an ideal for F. (The field discriminant is the generator of the relative discrimi- 
nant for K/Q with the appropriate sign attached.) One proves that the behavior 
of the relative discriminant under successive extension is reasonable, just as it is 
for degree of extension, ramification indices, residue class degrees, and relative 
differents. These results show that if $\mathbb{Q} \subseteq K \subseteq L$, then the field discriminant for
K divides the field discriminant for L. The next step is to extend the notion of
field discriminant so that it applies to commutative semisimple algebras and to
show that the discriminant of a tensor product over Q of finitely many number
fields is a certain function of the field discriminants and dimensions of the factors.
Finally we return to F(X) and its splitting field K'. Let $\xi$ be a root of F(X) in
K', and let $\xi_1 = \sigma_1(\xi), \dots, \xi_n = \sigma_n(\xi)$ be the distinct conjugates of $\xi$. Then K' is generated
by the subfields $\mathbb{Q}(\xi_1), \dots, \mathbb{Q}(\xi_n)$, and the ($\mathbb{Q}$ multilinear) multiplication map
extends to an algebra homomorphism of $\mathbb{Q}(\xi_1) \otimes_{\mathbb{Q}} \cdots \otimes_{\mathbb{Q}} \mathbb{Q}(\xi_n)$ onto K'. As
the tensor product of commutative semisimple algebras in characteristic 0, this is
commutative semisimple (Corollary 2.37) and is therefore a direct sum of fields
(Theorem 2.2). Thus we can regard K' as a subfield of the tensor product of fields
isomorphic to Q[X]/(F(X)), and the discriminant of K' divides the discriminant
of the tensor product. Putting everything together, we see that the only possible
primes dividing $D_{K'}$ are the primes that divide $D_K$. Therefore the primes that fail
to divide the discriminant of F(X) do not ramify in R'.


9. Global and Local Fields 


A global field K is either a number field, i.e., a finite extension of Q, or a function
field in one variable over a finite field, i.e., a finite extension of some $\mathbb{F}_q(X)$, where
$\mathbb{F}_q$ is a finite field.^24 An example of the latter is

$$K = \mathbb{F}_p(x)[y]/(y^2 - (x^3 - x)) \cong \mathbb{F}_p(x)[\sqrt{x^3 - x}\,].$$


In this section we shall develop some machinery for working with global fields. 
Our interest at present is in number fields, but function fields in one variable are 
the object of study in Chapter IX. Consequently the results will be stated for 
all global fields as long as all global fields can readily be treated together, and 
thereafter we shall specialize to number fields. 

The virtue of global fields for current purposes is that their completions with 
respect to nontrivial absolute values are always locally compact with a nontrivial 
topology. In the case of number fields, we know this for archimedean absolute 
values by Proposition 6.27, and it follows for nonarchimedean absolute values 
by Corollary 6.21 and Theorem 6.26. In the function-field case as above, the 
completions have to be nonarchimedean by Proposition 6.14, and their absolute 
values have to be discrete by Corollary 6.22; then the residue class fields are always
finite, and Theorem 6.26 shows that the completions are all locally compact with
a nontrivial topology.

^24 It will be shown in Chapter VII that a function field in one variable over a finite field is always
a finite separable extension of $\mathbb{F}_q(Y)$ for a suitable indeterminate Y.

To study a global field K in the style of this chapter, one studies simultaneously 
the completions^25 of K with respect to one absolute value from each equivalence
class.^26 Two completions are said to be equivalent completions if the absolute
values on the domains of the completion maps are equivalent in the sense of Sec- 
tion 3. An equivalence class of completions of nontrivial absolute values is called 
a place of K. A place is called archimedean or nonarchimedean according as 
the corresponding absolute values are archimedean or nonarchimedean; in the 
archimedean case it is called real or complex according as the locally compact 
completed field is R or C. 

Because of the special hypotheses for the situation with global fields, we shall 
see that to each place corresponds a distinguished choice of an absolute value 
on K from the equivalence class, called the normalized absolute value in the 
class.^27 These normalized completions are glued together^28 in a fashion to be
described in the next section to form the ring of “adeles” of K and the group of 
“ideles” of K . Historically ideles preceded adeles, and ideles were introduced in 
order to reinterpret class field theory and improve upon it; convincing motivation 
is therefore not readily at hand without knowledge that extends beyond this book. 
However, we can get some advance insight into how adeles and ideles might be 
useful from the first part of the classical proof of the Dirichlet Unit Theorem 
(Theorem 5.13) as given in Section V.5. 

That proof in effect handles archimedean places in a way similar to the way 
that adeles handle all places. In more detail let K be a number field of degree 
n over Q, and let R be its ring of algebraic integers. In Chapter V we usually 
regarded K as a subfield of C, but we shall not do so here. As was observed 
in Section V.2, there exist exactly n field mappings of K into C, and we denote
them by $\sigma_1, \dots, \sigma_n$. If x is in K, then the images $\sigma_1(x), \dots, \sigma_n(x)$ are called
the conjugates of x. Among $\sigma_1, \dots, \sigma_n$ are $r_1$ real-valued mappings and $r_2$
complex conjugate pairs, with $r_1 + 2r_2 = n$. Let us number the mappings so that
$\sigma_1, \dots, \sigma_{r_1}$ are real-valued and so that $\sigma_{r_1+1}, \dots, \sigma_{r_1+r_2}$ pick out one from each
complex conjugate pair. Proposition 6.27 shows that the functions $x \mapsto |\sigma_1(x)|$,
..., $x \mapsto |\sigma_{r_1+r_2}(x)|$ are a complete set of representatives for the archimedean
places of K; the first $r_1$ are real, and the last $r_2$ are complex.

^25 It is important not to lose sight of the fact that a "completion" is a certain kind of homomorphism
of valued fields and does not consist merely of the range space.

^26 The completion of the trivial absolute value is excluded.

^27 The range of each completion is a locally compact field whose topology is not the discrete
topology. Such a field is often called a local field in books. Examples are R, C, p-adic fields, and
fields $\mathbb{F}_q((X))$ of formal Laurent series. One can show that there are no other locally compact fields
whose topology is not discrete. The definition of "local field" in some books is arranged to exclude
R and C.

^28 It is tempting to think in terms of the gluing as involving just the locally compact fields, but
the completion mappings play a role and that description is thus an oversimplification.

Just before Lemma 5.17 we introduced the mapping $\Phi : K \to \mathbb{R}^{r_1} \times \mathbb{C}^{r_2}$ given
by

$$\Phi(x) = (\sigma_1(x), \dots, \sigma_{r_1+r_2}(x)) \quad\text{for } x \in K.$$

Lemma 5.17 observed that the image $\Phi(R)$ of R is a lattice in $\mathbb{R}^{r_1} \times \mathbb{C}^{r_2} \cong \mathbb{R}^n$.
The starting point for proving the Dirichlet Unit Theorem in Section V.5 was to
apply the Minkowski Lattice-Point Theorem to this lattice $\Phi(R)$. Proposition
6.27 allows us to interpret the mapping $\Phi$ as the natural embedding of K into the
product of its completions at all archimedean places.

The ring of adeles of K will be a corresponding space for dealing with com- 
pletions with respect to all nontrivial absolute values, archimedean and nonar- 
chimedean. 

While we have the archimedean places of the number field K at hand, let us
address the question of their normalized representatives. Since the field maps
from K into C given by $\sigma_{r_1+1}, \dots, \sigma_{r_1+r_2}$ are equal to the complex conjugates of
$\sigma_{r_1+r_2+1}, \dots, \sigma_n$, every member x of K has

$$N_{K/\mathbb{Q}}(x) = \prod_{j=1}^{n} \sigma_j(x) = \Big(\prod_{j=1}^{r_1} \sigma_j(x)\Big)\Big(\prod_{j=r_1+1}^{r_1+r_2} |\sigma_j(x)|^2\Big).$$

This formula can be viewed as an archimedean analog of the formula in Corollary
6.37b. The number field Q has one archimedean place, and ordinary absolute
value is taken as its normalized representative. We denote this representative by
$|\cdot|_\infty$. With $|\cdot|$ denoting ordinary absolute value on R and C, we obtain

$$|N_{K/\mathbb{Q}}(x)|_\infty = \Big(\prod_{j=1}^{r_1} |\sigma_j(x)|\Big)\Big(\prod_{j=r_1+1}^{r_1+r_2} |\sigma_j(x)|^2\Big).$$

It is customary to use letters like v and w as indices for places. The real places
are the completions $x \mapsto \sigma_j(x)$, $1 \le j \le r_1$, of K into R, and the normalized
absolute value on K for a real place is the pullback from ordinary absolute
value on R. Thus if $|\cdot|_{\mathbb{R}}$ denotes ordinary absolute value on R and if v is a
real place corresponding to $\sigma_j$, then we define $|x|_v = |\sigma_j(x)|_{\mathbb{R}}$ for $x \in K$. The
normalization to use for the complex places is motivated by the formula above.
If $r_1 + 1 \le j \le r_1 + r_2$, then $\sigma_j$ in effect contributes twice to the above formula,
once from j and once from $j + r_2$, and the notion of normalized absolute value is
to take this double contribution into account. Thus we write $|\cdot|_{\mathbb{C}}$ for the square
of the ordinary absolute value on C; this quantity is not really an absolute value,
since the triangle inequality fails for it, but it has too many desirable features to
be ignored. We define the normalized absolute value on K for a complex place
to be the pullback from this function $|\cdot|_{\mathbb{C}}$ on C even though the result fails to
satisfy the triangle inequality. Thus if v is a complex place corresponding to $\sigma_j$
with $r_1 + 1 \le j \le r_1 + r_2$, then we define $|x|_v = |\sigma_j(x)|_{\mathbb{C}} = |\sigma_j(x)|^2$ for $x \in K$.
With these definitions of normalized absolute values for archimedean places, the
formula above for $|N_{K/\mathbb{Q}}(x)|_\infty$ can be rewritten as

$$|N_{K/\mathbb{Q}}(x)|_\infty = \Big(\prod_{j=1}^{r_1} |\sigma_j(x)|_{\mathbb{R}}\Big)\Big(\prod_{j=r_1+1}^{r_1+r_2} |\sigma_j(x)|_{\mathbb{C}}\Big) = \Big(\prod_{v\ \mathrm{real}} |x|_v\Big)\Big(\prod_{v\ \mathrm{complex}} |x|_v\Big).$$

We summarize matters in the following proposition. 
Proposition 6.50. If K is a number field, then

$$|N_{K/\mathbb{Q}}(x)|_\infty = \prod_{v\ \mathrm{archimedean}} |x|_v \quad\text{for } x \in K,$$

where $|\cdot|_v$ is the pullback of $|\cdot|_{\mathbb{R}}$, the ordinary absolute value, for real places
and where $|\cdot|_v$ is the pullback of $|\cdot|_{\mathbb{C}}$, the ordinary absolute value squared, for
complex places.
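The proposition can be checked numerically in a small case. The following sketch takes K to be the field generated by the real cube root of 2, which has one real place and one complex place. It assumes the standard closed form a^3 + 2b^3 + 4c^3 - 6abc for the norm of a + b*theta + c*theta^2; that formula and the function names are assumptions of the example and are not taken from the text.

```python
import cmath

THETA = 2 ** (1 / 3)                    # real cube root of 2
OMEGA = cmath.exp(2j * cmath.pi / 3)    # primitive cube root of unity


def embeddings(a, b, c):
    """Images of x = a + b*theta + c*theta^2 under the real map and one complex map."""
    return [a + b * t + c * t * t for t in (THETA, THETA * OMEGA)]


def normalized_product(a, b, c):
    """Product of normalized absolute values: |sigma_1(x)| times |sigma_2(x)| squared."""
    s_real, s_complex = embeddings(a, b, c)
    return abs(s_real) * abs(s_complex) ** 2


def norm(a, b, c):
    """Norm of a + b*theta + c*theta^2 down to Q (assumed closed form)."""
    return a ** 3 + 2 * b ** 3 + 4 * c ** 3 - 6 * a * b * c


for x in [(1, 1, 0), (0, 1, 0), (1, 0, 1), (3, -2, 5)]:
    print(x, abs(norm(*x)), normalized_product(*x))
```

The two printed values in each row should agree up to floating-point rounding; the square on the complex factor is exactly the double contribution discussed before the proposition.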


At this point we could give a definition of normalized absolute value corre- 
sponding to nonarchimedean places. But we shall digress in order to motivate 
the definition using concepts from measure theory that may be known to some 
readers and not to others. These concepts play a role within the text only in the 
next paragraph and in Example 4 of normalized discrete absolute values below, 
and the reader will not miss any results or proofs by skipping this material. 

The digression begins. Any locally compact group has a nonzero measure
on it that is invariant under left translation,^29 and this measure is unique up to
multiplication by a scalar. Let a locally compact field L be given, and let $\mu$ be
an invariant measure of this kind with respect to the additive group of L. Each
nonzero element c of L has the property that $\mu(cE)$ is a multiple of $\mu(E)$ that
is independent of E. If we write $|c|_L$ for this multiple and put $|0|_L = 0$, then it
turns out that some power $|\cdot|_L^{\alpha}$ with $0 < \alpha \le 1$ is necessarily an absolute value
and that this power $\alpha$ can be taken to be 1 in all cases except when $L = \mathbb{C}$. In the
case of $\mathbb{C}$, it is easy to check that $|c|_{\mathbb{C}} = |c|^2$, and the triangle inequality therefore
fails for $\alpha = 1$. But in all other cases, $|\cdot|_L$ is a canonical choice for an absolute
value on L. Now suppose that $\iota : K \to L$ is a field map of a global field K onto
a dense subfield of a locally compact field. We impose this special absolute value
$|\cdot|_L$ on L. Then a necessary and sufficient condition on an absolute value $|\cdot|_K$
for $\iota : (K, |\cdot|_K) \to (L, |\cdot|_L)$ to be a completion is that $|\cdot|_K = \iota^*(|\cdot|_L)$.
In other words, the pullback of the special normalization of the absolute value on
the locally compact field is the natural normalization to use for the absolute value
on the global field.

^29 Although the details will not be important for us, let us be more precise: The measure is on
the $\sigma$-algebra of "Baire sets" on the group, namely the smallest $\sigma$-algebra containing those compact sets
that are intersections of countably many open sets. The measure is not the 0 measure, it is finite on
all the generating compact sets, and it takes the same value on a set as it does on any left translate
of the set. It is called a left Haar measure. For more information, see the author's Advanced Real
Analysis, Chapter VI.

With the digression now over, we want to associate to each nonarchimedean 
place of a global field a special normalization of an absolute value. (We handled 
the question of normalization at archimedean places earlier in the section.) We can 
be a bit more general. Suppose that F is an arbitrary field with a discrete valuation
v and with corresponding nontrivial absolute value given by $|x|_F = r^{-v(x)}$ for
some r > 1. Let R be the valuation ring and $\mathfrak{p}$ the valuation ideal; $\mathfrak{p}$ is a principal
ideal of the form $(\pi)$ for some $\pi \in R$. Suppose that the residue class field $R/\mathfrak{p}$ is
finite. Then we say that $|\cdot|_F$ is normalized if $|\pi|_F = |R/\mathfrak{p}|^{-1}$. This definition
is independent of the choice of $\pi$.


EXAMPLES OF NORMALIZED DISCRETE ABSOLUTE VALUES. 


(1) The field Q and the p-adic absolute value given by $|ab^{-1}p^k|_p = p^{-k}$ when
a and b are integers prime to p. The valuation ring R consists of all $ab^{-1}$ with
$a \in \mathbb{Z}$, $b \in \mathbb{Z}$, and b prime to p. The valuation ideal $\mathfrak{p}$ consists of all such $ab^{-1}$
with a divisible by p, and the quotient $R/\mathfrak{p}$ is isomorphic to $\mathbb{F}_p$. The element
$\pi$ may be taken to be p, and $|p|_p$ equals $p^{-1}$, which equals $|R/\mathfrak{p}|^{-1}$. Thus the
p-adic absolute value on Q is normalized.
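Example 1 is easy to experiment with. The following sketch computes the exponent k and the normalized p-adic absolute value of a nonzero rational number with exact arithmetic; the function names are hypothetical.

```python
from fractions import Fraction


def ord_p(y, p):
    """Exponent k in y = (a/b) * p**k with a and b prime to p; y must be nonzero."""
    y = Fraction(y)
    if y == 0:
        raise ValueError("ord_p is not defined at 0")
    k, num, den = 0, y.numerator, y.denominator
    while num % p == 0:
        num //= p
        k += 1
    while den % p == 0:
        den //= p
        k -= 1
    return k


def padic_abs(y, p):
    """Normalized p-adic absolute value p**(-k) of a nonzero rational y."""
    return Fraction(1, p) ** ord_p(y, p)


print(padic_abs(3, 3))                  # the value at pi = p
print(padic_abs(Fraction(50, 27), 5))
print(padic_abs(Fraction(50, 27), 3))
```

The first value is 1/p for p = 3, matching the order 3 of the residue class field, which is the normalization condition of the example.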


(2) Let K be a number field of degree n over Q, and let T be its ring of algebraic
integers. Let $\mathfrak{p}$ be a nonzero prime ideal in T, and let v be the corresponding
valuation of K. Let $q = |T/\mathfrak{p}|$, and define $|x|_{\mathfrak{p}} = q^{-v(x)}$. Then $|\cdot|_{\mathfrak{p}}$ is
normalized because Theorem 6.5e shows that the residue class field obtained
from the valuation is isomorphic to $T/\mathfrak{p}$.

(3) Let $K = \mathbb{F}_q(X)$, fix a prime polynomial c(X) in $\mathbb{F}_q[X]$, and consider
the absolute value on K defined by $|a(X)b(X)^{-1}c(X)^k| = q^{-k \deg c}$ whenever
a(X) and b(X) are polynomials relatively prime to c(X). This example runs
completely parallel to the two previous examples, and $\pi$ may be taken to be
c(X). The residue class field has as representatives all polynomials h(X) with
deg h(X) < deg c(X) and thus has order $q^{\deg c}$. This order matches $|c(X)|^{-1}$,
and hence $|\cdot|$ is normalized.


(4) If F is a locally compact field whose topology comes from some nontrivial
discrete absolute value with finite residue class field, then the canonical absolute
value $|\cdot|_F$ described in the digression above and obtained from an invariant
measure $\mu$ on the additive group of F is normalized. To see this, let R and
$\mathfrak{p}$ be the valuation ring and valuation ideal, and write $\mathfrak{p} = (\pi)$. Put
$m = |R/\mathfrak{p}|$, and let $x_1, \dots, x_m$ be representatives of the m cosets of $R/\mathfrak{p}$ in R. Then
$\mu(x_j + \mathfrak{p}) = \mu(\mathfrak{p})$ for $1 \le j \le m$ by translation invariance of $\mu$, and hence
$\mu(R) = \sum_{j=1}^{m} \mu(x_j + \mathfrak{p}) = m\mu(\mathfrak{p})$. Substituting and using the definition of $|\cdot|_F$
gives $\mu(\mathfrak{p}) = \mu(\pi R) = |\pi|_F\,\mu(R) = |\pi|_F\,m\,\mu(\mathfrak{p})$. The number $\mu(\mathfrak{p})$ is positive,
since $\mathfrak{p}$ is a nonempty open subset of F, and we can cancel to get $|\pi|_F\,m = 1$.
Thus $|\pi|_F = |R/\mathfrak{p}|^{-1}$, and $|\cdot|_F$ is normalized.


Theorem 6.51 (Artin product formula). If F is a number field and if normalized
absolute values are used, then

$$\prod_v |x|_v = 1 \quad\text{for all nonzero } x \in F,$$

the product being taken over all places v. In this product, only finitely many of
the factors can be different from 1.


REMARKS. A version of this theorem is valid for function fields in one variable. 
As Corollary 6.22 permits, one can state this analogous theorem in terms of 
discrete valuations that are trivial on the base field, and absolute values need play 
no role. The precise statement and proof appear in Chapter IX. Corollary 6.9 in 
the present chapter is a special case. 


PROOF. First we prove the result for Q. Let a rational $y = \pm p_1^{k_1} \cdots p_r^{k_r}$ be
given; here $p_1, \dots, p_r$ are distinct primes. The product $\prod_v |y|_v$ is taken over
all places, hence over all primes and the one archimedean place $\infty$. For this
$y \in \mathbb{Q}$ we have $|y|_{p_j} = p_j^{-k_j}$ for $1 \le j \le r$ and $|y|_{p'} = 1$ for all other
primes p'. So $\prod_{p\ \mathrm{prime}} |y|_p = p_1^{-k_1} \cdots p_r^{-k_r}$. Since $|y|_\infty = p_1^{k_1} \cdots p_r^{k_r}$, we obtain
$\prod_v |y|_v = 1$.

Let R be the ring of algebraic integers in F. Given x in F, factor the fractional
ideal xR. The nonarchimedean places correspond to the nonzero prime ideals
in R, and $|x|_v$ is 1 except for the v's corresponding to those prime ideals in the
factorization. There are only finitely many of these. Since also there are only
finitely many archimedean places, we see that $|x|_v = 1$ for all but finitely many v.

Let us consider the nonarchimedean places separately from the archimedean
ones. The nonarchimedean places correspond to nonzero prime ideals $\wp$, and we
group these according to the prime number p such that $\wp \cap \mathbb{Z} = p\mathbb{Z}$, writing
$\wp \mid p\mathbb{Z}$ for this correspondence. For fixed p and for each $\wp$ with $\wp \mid p\mathbb{Z}$, let
$x_\wp$ be the image of x under the local embedding in $F_\wp$. Corollary 6.37b gives
$N_{F/\mathbb{Q}}(x) = \prod_{\wp \mid p\mathbb{Z}} N_{F_\wp/\mathbb{Q}_p}(x_\wp)$. Theorem 6.33 shows that $|x_\wp|_{F_\wp}$ is a power
of $|N_{F_\wp/\mathbb{Q}_p}(x_\wp)|_{\mathbb{Q}_p}$. To determine the power, we observe from Example 2 that
the canonical absolute values on $\mathbb{Q}_p$ and $F_\wp$ are normalized, and we specialize
$|x_\wp|_{F_\wp}$ and $|N_{F_\wp/\mathbb{Q}_p}(x_\wp)|_{\mathbb{Q}_p}$ to $x_\wp$ in $\mathbb{Q}_p$. Making the comparison, we find that
$|N_{F_\wp/\mathbb{Q}_p}(x_\wp)|_{\mathbb{Q}_p} = |x_\wp|_{F_\wp}$. We know that each local embedding respects absolute
values; since Theorems 6.5e and 6.26e together show that the residue class fields
of $F_\wp$ and $\mathbb{Q}_p$ have orders $|R/\wp|$ and $|\mathbb{Z}/p\mathbb{Z}|$, it follows that $|x_\wp|_{F_\wp} = |x|_\wp$.
Therefore

$$|N_{F/\mathbb{Q}}(x)|_p = |N_{F/\mathbb{Q}}(x)|_{\mathbb{Q}_p} = \prod_{\wp \mid p\mathbb{Z}} |N_{F_\wp/\mathbb{Q}_p}(x_\wp)|_{\mathbb{Q}_p} = \prod_{\wp \mid p\mathbb{Z}} |x_\wp|_{F_\wp} = \prod_{\wp \mid p\mathbb{Z}} |x|_\wp. \qquad (*)$$

For the finitely many archimedean places, Proposition 6.50 gives us the formula

$$|N_{F/\mathbb{Q}}(x)|_\infty = \prod_{v\ \mathrm{archimedean}} |x|_v, \qquad (**)$$

where $|\cdot|_\infty$ is the ordinary absolute value on Q. Multiplying (*) and (**) and
using the known identity $\prod_v |y|_v = 1$ for the element $y = N_{F/\mathbb{Q}}(x)$ of Q, we
obtain the theorem.
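The structure of this proof can be traced in the field Q(i). The sketch below is restricted to nonzero Gaussian integers x = a + bi and relies on standard facts about Z[i] that are assumptions of the example rather than results of this section: the prime 2 lies under (1 + i), a prime p congruent to 3 modulo 4 stays prime with residue class field of order p^2, and a prime p congruent to 1 modulo 4 splits as (c + di)(c - di) with c^2 + d^2 = p. For each rational prime it checks the local identity (*) and then multiplies in the single complex place. All function names are hypothetical.

```python
from fractions import Fraction
from math import isqrt


def padic_abs(n, p):
    """Normalized p-adic absolute value of a nonzero integer n."""
    k = 0
    while n % p == 0:
        n //= p
        k += 1
    return Fraction(1, p ** k)


def prime_factors(n):
    """Set of primes dividing the positive integer n (trial division)."""
    out, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            out.add(p)
            n //= p
        p += 1
    if n > 1:
        out.add(n)
    return out


def primes_above(p):
    """Pairs (generator, residue field order) for the primes of Z[i] over p."""
    if p == 2:
        return [((1, 1), 2)]
    if p % 4 == 3:
        return [((p, 0), p * p)]
    c = next(c for c in range(1, isqrt(p) + 1) if isqrt(p - c * c) ** 2 == p - c * c)
    d = isqrt(p - c * c)
    return [((c, d), p), ((c, -d), p)]


def valuation(x, pi):
    """Number of times the Gaussian prime pi divides the nonzero Gaussian integer x."""
    (a, b), (c, d) = x, pi
    n = c * c + d * d
    v = 0
    while True:
        re, im = a * c + b * d, b * c - a * d   # x times the conjugate of pi
        if re % n or im % n:
            return v
        a, b, v = re // n, im // n, v + 1


def product_over_all_places(x):
    """Product of the normalized absolute values of x = a + bi over all places of Q(i)."""
    a, b = x
    norm = a * a + b * b
    total = Fraction(norm)                      # complex place: ordinary absolute value squared
    for p in prime_factors(norm):
        local = Fraction(1)
        for pi, q in primes_above(p):
            local *= Fraction(1, q) ** valuation(x, pi)
        assert local == padic_abs(norm, p)      # the identity (*) at the prime p
        total *= local
    return total


for x in [(1, 1), (3, 0), (2, 1), (5, 0), (7, 4), (0, 6)]:
    print(x, product_over_all_places(x))
```

Only the primes dividing the norm are visited, since x is a unit at every other nonarchimedean place and contributes a factor of 1 there.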


10. Adeles and Ideles 


In this section we do the gluing that creates the adeles and the ideles out of the 
places of a global field. We begin with a topological construction, and then we 
superimpose the algebraic structure. The general constructions and the two main 
theorems will be valid for all global fields, but we shall discuss proofs of the 
theorems only for number fields. 

Suppose that $\{X_i \mid i \in I\}$ is a nonempty family of locally compact Hausdorff
spaces. Assume that for all but finitely many $i \in I$ we are given a compact open
subset $Z_i$ of $X_i$. The restricted direct product of the $X_i$'s relative to the $Z_i$'s is
the subset

$$\prod\nolimits'_{i \in I} X_i \subseteq \prod_{i \in I} X_i$$

defined by

$$(x_i)_{i \in I} \in \prod\nolimits'_{i \in I} X_i \quad\text{if and only if } x_i \in Z_i \text{ for all but finitely many } i.$$

The restricted direct product is topologized as follows. Suppose that $S \subseteq I$ is a
finite subset and that $Z_i$ is defined for $i \notin S$. Put

$$X(S) = \Big(\prod_{i \in S} X_i\Big) \times \Big(\prod_{i \notin S} Z_i\Big).$$

In their respective product topologies the first factor is locally compact, and the
second factor is compact. Certainly X(S) is a subset of the restricted direct
product, and evidently the restricted direct product is the union of the subsets
X(S) over all finite subsets S for which $Z_i$ is defined when $i \notin S$. We topologize
$\prod'_{i \in I} X_i$ by insisting that each X(S) be an open subset.^30 The resulting topology
is locally compact Hausdorff. In fact, any two members of $\prod'_{i \in I} X_i$ lie in a
common X(S), and the open sets that separate them in X(S) separate them in
$\prod'_{i \in I} X_i$. Also, any $(x_i)_{i \in I}$ is in some X(S), which is locally compact, and a
compact neighborhood within X(S) will be a compact neighborhood in $\prod'_{i \in I} X_i$.

Now we superimpose the algebraic structure. Let K be a global field. To each
place v of K, we have associated a normalized absolute value $|\cdot|_v$ on K and a
completion $\iota_v : (K, |\cdot|_v) \to (K_v, |\cdot|_{K_v})$. Each of the complete valued fields
$K_v$ is locally compact. Except at the finitely many archimedean places, which
occur only in the number-field case, $|\cdot|_{K_v}$ arises from a discrete valuation. We
take $R_v$ to be the corresponding valuation ring, i.e., $R_v = \{x \in K_v \mid |x|_{K_v} \le 1\}$.
This is a compact open additive subgroup of $K_v$. Thus we can form a restricted
direct product in which the index set I is the set of places of K, the v-th locally
compact Hausdorff space is $K_v$, and the v-th compact open subset is $R_v$. This
restricted direct product carries the structure of a commutative ring with identity,
with its addition and multiplication defined in coordinate-by-coordinate fashion,
and the operations are continuous. Thus we obtain a topological ring, known as
the ring of adeles of K and denoted by $\mathbb{A}_K$ or simply by $\mathbb{A}$ when no ambiguity is
possible.

If for each $x \in K_{v_0}$, we send x into the tuple $(a_v)_v$ that has $a_{v_0} = x$ and $a_v = 0$
for $v \ne v_0$, then the result is a one-one continuous ring homomorphism of $K_{v_0}$
into $\mathbb{A}$. This homomorphism of course does not send the multiplicative identity
of $K_{v_0}$ to the multiplicative identity of $\mathbb{A}$.

The completion mappings $\iota_v : K \to K_v$ embed K into each $K_v$, and we can
form a corresponding diagonal map $\iota : K \to \prod_v K_v$ into the full product of $K_v$'s
by defining $\iota(x) = (\iota_v(x))_v$. Actually, we shall check for $x \ne 0$ that only finitely
many places have $|\iota_v(x)|_{K_v} = |x|_v$ unequal to 1, and therefore the image of the
diagonal map is in the adeles. Thus we have a diagonal ring homomorphism

$$\iota : K \to \mathbb{A} \quad\text{given by}\quad \iota(x) = (\iota_v(x))_v \ \text{ for } x \in K.$$

The fact that in the number-field case, $|x|_v$ is unequal to 1 for only finitely many
places appears as part of Theorem 6.51. For the function-field case, the field K is
a finite separable extension of some field $\mathbb{F}_q(X)$, and all but finitely many places
come from nonzero prime ideals in the integral closure R of $\mathbb{F}_q[X]$ in K. At the
unexceptional such places the value of $|x|_v$ comes by treating xR as a fractional
ideal and factoring it; only finitely many ideals are involved in the factorization,
and only those among all the unexceptional places can have $|x|_v \ne 1$.

^30 In other words, a set in $\prod'_{i \in I} X_i$ is open if and only if its intersection with each X(S) is open
in X(S).

The main structural theorem about the adeles is as follows.


Theorem 6.52. If K is a global field, then the image of K in the adeles $\mathbb{A}$
under the diagonal mapping $\iota : K \to \mathbb{A}$ is discrete, and the quotient $\mathbb{A}/\iota(K)$ of
additive groups is compact.


For a number field the compactness in Theorem 6.52 encodes Lemma 5.17 
and the Strong Approximation Theorem. The proof of the theorem is not hard, 
and we return to it in a moment. In the current discussion Theorem 6.52 is 
not something to appreciate for its own consequences but instead is a prototype 
for a corresponding theorem about “ideles” that encodes for number fields the 
finiteness of the class number and the Dirichlet Unit Theorem. 

The construction of the “ideles” of K proceeds similarly to the construction 
of the adeles. Again we use a restricted direct product, with the set of places as 
index set. The locally compact Hausdorff space associated to the place v is the 
multiplicative group $K_v^\times$. For v nonarchimedean, we again let $R_v$ be the valuation
ring in $K_v$, and take the compact open subset of $K_v^\times$ to be the group $R_v^\times$ of units
in $R_v$, i.e., $R_v^\times = \{x \in K_v \mid |x|_{K_v} = 1\}$. The group of ideles is the restricted direct
product of the groups $K_v^\times$ relative to the compact subgroups $R_v^\times$. The result is a
locally compact abelian group, known as the group of ideles of K and denoted
by $\mathbb{I}_K$ or simply by $\mathbb{I}$.

Warning: As a set, $\mathbb{I}$ coincides with the group of units $\mathbb{A}^\times$. However, the
topologies do not match. The topology for $\mathbb{I}$ is finer than the relative topology on
$\mathbb{A}^\times$. See Problems 7-8 at the end of the chapter.

If for each $x \in K_{v_0}^\times$, we send x into the tuple $(a_v)_v$ that has $a_{v_0} = x$ and $a_v = 1$
for $v \ne v_0$, then the result is a one-one continuous group homomorphism of $K_{v_0}^\times$
into $\mathbb{I}$. As with the adeles we also have a diagonal mapping $\iota : K^\times \to \mathbb{I}$ given by
$\iota(x) = (\iota_v(x))_v$; the image is contained in $\mathbb{I}$, since for a nonzero $x \in K$, $|x|_v$ can
be unequal to 1 for only finitely many v.

The Artin product formula (Theorem 6.51) and the corresponding result for 
function fields in one variable over a finite field put a constraint on the image. We 
define the absolute value $|(a_v)_v|$ of an idele $(a_v)_v$ to be the product of the absolute
values of the components: $|(a_v)_v| = \prod_v |a_v|_v$. This is well defined because only
finitely many factors are allowed to be different from 1. If $\mathbb{I}^1$ denotes the group
of ideles of absolute value 1, then $\mathbb{I}^1$ is a closed subgroup of $\mathbb{I}$. The Artin product
formula and its function-field analog imply that the image of the diagonal mapping
is contained in $\mathbb{I}^1$. The main structural theorem about the ideles is as follows.


Theorem 6.53. If K is a global field, then the image of $K^\times$ in the subgroup
$\mathbb{I}^1$ of the ideles $\mathbb{I}$ under the diagonal mapping $\iota : K^\times \to \mathbb{I}$ is discrete, and the
quotient group $\mathbb{I}^1/\iota(K^\times)$ is compact.


From now on, we suppose that the global field K is a number field. Let 
$S_\infty$ be the set of archimedean places. We begin by supplying direct proofs
of the discreteness in Theorems 6.52 and 6.53 and of the compactness of the 
quotient in Theorem 6.52. After some additional discussion we return to prove 
the compactness of the quotient in Theorem 6.53. 


PROOF OF DISCRETENESS OF $\iota(K)$ IN THEOREM 6.52. It is enough to produce
a neighborhood U of 0 in $\mathbb{A}$ such that $U \cap \iota(K) = \{0\}$. The set U of all
$(x_v)_v \in \mathbb{A}$ such that $|x_v|_v < 1$ for all archimedean places and $|x_v|_v \le 1$ for all
nonarchimedean places is an open product set in $\mathbb{A}(S_\infty)$ and hence is an open
neighborhood of 0 in $\mathbb{A}$. Since Theorem 6.51 shows that $\prod_v |\iota_v(y)|_v = 1$ for all
$y \ne 0$ in K and since $\prod_v |x_v|_v < 1$ for all $(x_v)_v$ in U, $U \cap \iota(K) = \{0\}$.


PROOF OF DISCRETENESS OF $\iota(K^\times)$ IN THEOREM 6.53. The set U of all
$(x_v)_v \in \mathbb{I}$ such that $|x_v - 1|_v < 1$ for all archimedean places and $|x_v - 1|_v \le 1$ for all
nonarchimedean places is an open product set in $\mathbb{I}(S_\infty)$ and hence is an open
neighborhood of 1 in $\mathbb{I}$. If $(x_v)_v = \iota(y)$ with $y \in K^\times$ and $y \ne 1$, then $x_v - 1 = \iota_v(y - 1)$
with $y - 1 \ne 0$, and Theorem 6.51 shows that $\prod_v |\iota_v(y) - 1|_v = \prod_v |\iota_v(y - 1)|_v = 1$.
The members $(x_v)_v$ of U all have $\prod_v |x_v - 1|_v < 1$, and thus $U \cap \iota(K^\times) = \{1\}$.


PROOF OF COMPACTNESS OF $\mathbb{A}/\iota(K)$ IN THEOREM 6.52. We begin by observing
that

$$\mathbb{A} = \iota(K) + \mathbb{A}(S_\infty), \qquad (*)$$

i.e., that the set of sums of a member of $\iota(K)$ and a member of $\mathbb{A}(S_\infty)$ exhausts $\mathbb{A}$.
In fact, given $(x_v)_v$ in $\mathbb{A}$, we let $v_1, \dots, v_r$ be the finitely many nonarchimedean
places for which $|x_v|_v > 1$. The Strong Approximation Theorem (Theorem
6.44) applied to the elements $x_{v_1}, \dots, x_{v_r}$ produces a member y of K such that
$|\iota_{v_j}(y) - x_{v_j}|_{v_j} \le 1$ for $1 \le j \le r$ and such that $|\iota_v(y)|_v \le 1$ for all other
nonarchimedean places v. Consequently $|\iota_v(y) - x_v|_v \le 1$ for all nonarchimedean
v. This inequality means exactly that $(x_v)_v - \iota(y)$ is in $\mathbb{A}(S_\infty)$. Hence

$$(x_v)_v = \iota(y) + ((x_v)_v - \iota(y))$$

is the required decomposition, and (*) is proved.
In addition, we have

$$\iota(R) = \iota(K) \cap \mathbb{A}(S_\infty). \qquad (**)$$

In fact, the inclusion $\subseteq$ is clear. For the inclusion $\supseteq$, let y be a member of K such
that $\iota(y)$ is in $\mathbb{A}(S_\infty)$. Then $|\iota_v(y)|_v \le 1$ for all nonarchimedean v, and it follows
that y is in R.
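For K = Q the decomposition (*) can be made completely explicit, without appeal to Theorem 6.44. The sketch below assumes that each nonarchimedean component is given as a rational number standing in for a p-adic number, with all unlisted components equal to 0. It takes y to be the sum of the p-adic fractional parts of the listed components; this elementary construction is an assumption of the example and is not the method of the text. The function names are hypothetical, and the three-argument pow with exponent -1 needs Python 3.8 or later.

```python
from fractions import Fraction


def padic_fractional_part(r, p):
    """Rational m/p**k with 0 <= m < p**k such that r - m/p**k has denominator prime to p."""
    r = Fraction(r)
    cofactor, pk = r.denominator, 1
    while cofactor % p == 0:
        cofactor //= p
        pk *= p
    if pk == 1:
        return Fraction(0)
    m = (r.numerator * pow(cofactor, -1, pk)) % pk   # modular inverse of the prime-to-p part
    return Fraction(m, pk)


def approximate(components):
    """components: dict prime -> rational x_p; every other finite component is taken to be 0."""
    return sum((padic_fractional_part(x, p) for p, x in components.items()), Fraction(0))


def is_p_integral(r, p):
    """True if the rational r lies in the valuation ring at p."""
    return Fraction(r).denominator % p != 0


x = {2: Fraction(3, 4), 3: Fraction(5, 9), 5: Fraction(7, 10)}
y = approximate(x)
print(y)
print([is_p_integral(x.get(p, 0) - y, p) for p in (2, 3, 5, 7, 11)])
```

Each fractional part has a denominator that is a power of a single prime, so subtracting y repairs the component at that prime without disturbing integrality at the others; this is why every entry of the printed list should be True.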

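To prove the compactness, we use the identity $(M+N)/M \cong N/(M \cap N)$ given
by the Second Isomorphism Theorem in the category of locally compact abelian
groups, taking $M = \iota(K)$ and $N = \mathbb{A}(S_\infty)$. Then (*) shows that $M + N = \mathbb{A}$,
and (**) shows that $M \cap N = \iota(R)$. Hence

$$\mathbb{A}/\iota(K) \cong \mathbb{A}(S_\infty)/\iota(R). \qquad (\dagger)$$

Let us write $\mathbb{A}(S_\infty) = \Omega \times \Lambda$, where $\Omega = \mathbb{R}^{r_1} \times \mathbb{C}^{r_2} = \prod_{v\ \mathrm{archimedean}} K_v$ and
$\Lambda = \prod_{v\ \mathrm{nonarchimedean}} R_v$. The mapping $\Phi : K \to \Omega$ defined near the beginning
of Section 9 has the property that

$$\iota(R) + (\{0\} \times \Lambda) = \Phi(R) \times \Lambda.$$

From this equality we obtain

$$\mathbb{A}(S_\infty)/(\iota(R) + (\{0\} \times \Lambda)) = (\Omega \times \Lambda)/(\Phi(R) \times \Lambda) \cong \Omega/\Phi(R),$$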


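and Lemma 5.17 shows that this is compact. Since $(\{0\} \times \Lambda) \cap \iota(R) = \{0\}$,
application of the First Isomorphism Theorem and then the Second Isomorphism
Theorem shows that the kernel of the quotient homomorphism of $\mathbb{A}(S_\infty)/\iota(R)$
onto $\mathbb{A}(S_\infty)/(\iota(R) + (\{0\} \times \Lambda))$ is

$$(\iota(R) + (\{0\} \times \Lambda))/\iota(R) \cong (\{0\} \times \Lambda)/((\{0\} \times \Lambda) \cap \iota(R)) = \{0\} \times \Lambda,$$

and this is compact also. So the closed subgroup $(\iota(R) + (\{0\} \times \Lambda))/\iota(R)$ of
$\mathbb{A}(S_\infty)/\iota(R)$ and the quotient by this subgroup are both exhibited as compact, and
it follows that $\mathbb{A}(S_\infty)/\iota(R)$ is compact. Application of (†) shows that $\mathbb{A}/\iota(K)$ is
compact.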


A first approach to proving the compactness of $\mathbb{I}^1/\iota(K^\times)$ in Theorem 6.53 is to
pursue an analogy with the above proof for $\mathbb{A}/\iota(K)$ by showing that multiplicative
analogs of (*) and (**) from that proof are valid here:

$$\mathbb{I} = \iota(K^\times)\,\mathbb{I}(S_\infty),$$
$$\iota(R^\times) = \iota(K^\times) \cap \mathbb{I}(S_\infty).$$

The second of these formulas is fine and is easily proved: The inclusion $\iota(R^\times) \subseteq
\iota(K^\times) \cap \mathbb{I}(S_\infty)$ is clear. For the inclusion $\iota(R^\times) \supseteq \iota(K^\times) \cap \mathbb{I}(S_\infty)$, let y be a
member of $K^\times$ such that $\iota(y)$ is in $\mathbb{I}(S_\infty)$. Then $|\iota_v(y)|_v = 1$ for all nonarchimedean
v, and it follows that y and $y^{-1}$ are in R, hence that y is in $R^\times$.

The difficulty is that an equality $\mathbb{I} = \iota(K^\times)\,\mathbb{I}(S_\infty)$ holds if and only if the ring
R of algebraic integers in K is a principal ideal domain. Let us elaborate on this
point, since we will be led by it to the relationship between ideles and the ideal
class group that makes ideles useful.

Let us enumerate the nonzero prime ideals of R as P_1, P_2, ... in some fashion.
As was mentioned in Section 2, each nonzero fractional ideal I in K has a finite
unique factorization of the form I = P_{i_1}^{k_{i_1}} ⋯ P_{i_n}^{k_{i_n}}, where
k_{i_1}, ..., k_{i_n} are integers.
The mapping that carries I to the tuple (a_j)_{j≥1} with a_j = k_{i_l} when j = i_l and
a_j = 0 when j is not in {i_1, ..., i_n} is a group isomorphism Ψ from the group ℐ
of fractional ideals onto a free abelian group ⊕_{j=1}^∞ Z of countably infinite rank.
Some of these fractional ideals are of the form xR for some x ∈ K^×, and they are
the principal fractional ideals. They form a subgroup 𝒫 of ℐ that is isomorphic
to K^×/R^×, and the quotient ℐ/𝒫 is isomorphic to the ideal class group of K, as was
shown at the end of Section 2. Theorem 5.19 says that the group ℐ/𝒫 is a finite
group; its order is the class number of K.

Meanwhile, suppose that (x_v)_v is a member of the group 𝕀 of ideles. To
each nonarchimedean place v, Corollary 6.8 associates a unique nonzero prime
ideal, which we write as P_{i(v)} for a function i(·). If q_v = |R/P_{i(v)}|, then the
relationship between the valuation ord_v(·) and the normalized absolute value
associated to P_{i(v)} is |x_v|_v = q_v^{-ord_v(x_v)}. Since (x_v)_v is an idele, there are only
finitely many nonarchimedean v's for which ord_v(x_v) is not 0. We can therefore
map (x_v)_v into the tuple of integers (ord_v(x_v))_v and compose with Ψ^{-1} to obtain
a homomorphism of the group 𝕀 into the group ℐ of fractional ideals. In more
detail, the mapping from 𝕀 to ⊕_{j=1}^∞ Z is given by (x_v)_v ↦ (a_j)_{j≥1} with
a_{i(v)} = ord_v(x_v), and then Ψ^{-1} interprets this sequence of integers as the exponents of
the appropriate prime ideals. Since any association of members of K_v^× at finitely
many nonarchimedean places can be extended to an idele by making the idele
be 1 at the remaining places, this homomorphism of 𝕀 into ℐ is onto ℐ.
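The mapping just described can be made concrete in the simplest case K = Q, R = Z, where the nonarchimedean places correspond to the positive primes p and q_v = p. The following is only a sketch under that assumption; the function names are ours and do not come from the text. For a principal idele ι(x) it lists the nonzero exponents ord_p(x), which are the exponents of the fractional ideal xZ, and it also checks the Artin product formula for x.

```python
from fractions import Fraction

def ord_p(x: Fraction, p: int) -> int:
    """Exponent of the prime p in the nonzero rational x."""
    if x == 0:
        raise ValueError("ord_p is only defined here for nonzero x")
    num, den, k = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        k += 1
    while den % p == 0:
        den //= p
        k -= 1
    return k

def prime_factors(n: int) -> list[int]:
    """Distinct primes dividing |n|, found by trial division."""
    n, p, out = abs(n), 2, []
    while p * p <= n:
        if n % p == 0:
            out.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        out.append(n)
    return out

def exponents(x: Fraction) -> dict[int, int]:
    """Nonzero values ord_p(x): the exponent tuple attached to the idele of x."""
    primes = sorted(set(prime_factors(x.numerator) + prime_factors(x.denominator)))
    return {p: ord_p(x, p) for p in primes}

def product_of_absolute_values(x: Fraction) -> Fraction:
    """|x| at the real place times the product of p^(-ord_p(x)) over all primes p."""
    total = abs(x)
    for p, k in exponents(x).items():
        total *= Fraction(p) ** (-k)
    return total

x = Fraction(-12, 35)
print(exponents(x))                   # {2: 2, 3: 1, 5: -1, 7: -1}
print(product_of_absolute_values(x))  # 1
```

The sign of x is invisible in the exponent tuple, which reflects the fact that x and −x generate the same fractional ideal.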

Now suppose that the given idele (x_v)_v is of the form ι(x) for some x in K^×.
Then the procedure for mapping this idele to a product of powers of the nonzero
prime ideals of R is the same as the procedure for decomposing the fractional
ideal xR as a product of powers of nonzero prime ideals of R. Consequently our
homomorphism descends to a homomorphism

𝕀/ι(K^×) → ℐ/𝒫

of the idele class group 𝕀/ι(K^×) onto the (finite) ideal class group ℐ/𝒫. This
is the fundamental fact about the ideles; the displayed homomorphism in effect
says that the idele class group refines the information in the ideal class group.
The subject of class field theory shows that this refined information is useful.

Under the homomorphism of 𝕀 onto ℐ, the kernel consists exactly of 𝕀(S_∞),
the ideles whose components at each nonarchimedean place v are in R_v^×. Thus
𝕀/𝕀(S_∞) → ℐ is an isomorphism. Taking into account the effect on ι(K^×), we
obtain an isomorphism

𝕀/(ι(K^×) 𝕀(S_∞)) ≅ ℐ/𝒫.

Returning to our hoped-for equality 𝕀 = ι(K^×) 𝕀(S_∞) and comparing with the
displayed isomorphism, we see that 𝕀 equals ι(K^×) 𝕀(S_∞) if and only if ℐ = 𝒫.
Equality ℐ = 𝒫 holds if and only if every fractional ideal of K is principal, if and
only if every ordinary ideal of R is principal.
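As a hedged illustration of how the equality of the group of fractional ideals with its subgroup of principal ones can fail, consider R = Z[√−5], the ring of Problem 4 below; this example is ours and is not part of the argument. The norm of a + b√−5 is a^2 + 5b^2, and a generator of the index-2 ideal (2, 1 + √−5) or of the index-3 ideal (3, 1 + √−5) would have to have norm 2 or 3. The sketch below (with the hypothetical helper name elements_of_norm) searches for such elements and finds none, while it does find the four elements ±1 ± √−5 of norm 6.

```python
import math

def elements_of_norm(n: int) -> list[tuple[int, int]]:
    """All integer pairs (a, b) with a^2 + 5*b^2 == n, for n >= 0."""
    bound = math.isqrt(n)  # |a| and |b| cannot exceed sqrt(n)
    return [(a, b)
            for a in range(-bound, bound + 1)
            for b in range(-bound, bound + 1)
            if a * a + 5 * b * b == n]

print(elements_of_norm(2))  # []
print(elements_of_norm(3))  # []
print(elements_of_norm(6))  # [(-1, -1), (-1, 1), (1, -1), (1, 1)]
```

Since no element has norm 2 or 3, those two ideals are not principal, and for this ring the hoped-for equality of the ideles with the product of the principal ideles and the ideles that are units at every nonarchimedean place fails.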

Thus we see why a direct analog of the proof of Theorem 6.52 does not work
for Theorem 6.53. But at the same time we obtain information about how to give
a correct proof. We saw that factoring 𝕀/ι(K^×) by 𝕀(S_∞) leads to the finite group
ℐ/𝒫. We shall see that if we factor 𝕀/ι(K^×) by a suitably larger group 𝕀(S) with
S still finite, then the quotient is the trivial group. An indication of this fact was
in Problems 19-23 at the end of Chapter V, which showed that if we localize R
at a large enough finite set of nonzero prime ideals, then the result is a principal
ideal domain. In adelic/idelic terms the corresponding procedure is to enlarge
S_∞ to a suitable finite set S containing S_∞ and to replace 𝕀(S_∞) by 𝕀(S); this
enlargement has the effect of replacing R_v^× by K_v^× at finitely many places v in
considering what happens to ideals, and this is exactly what the localization in
those problems accomplishes. Thus for a suitable finite set S containing S_∞, we
will have an isomorphism

𝕀/(ι(K^×) 𝕀(S)) ≅ {1};

in other words,

𝕀 = ι(K^×) 𝕀(S)

for a suitable finite set S containing S_∞.

One final remark is needed, and then we are ready to carry out the proof of
the compactness of 𝕀^1/ι(K^×). The remark is that we always have at least one
archimedean place, and adjusting an idele suitably at one archimedean place
can change it from being in 𝕀 to being in the subgroup 𝕀^1 of ideles for which
∏_v |x_v|_v = 1. The members of ι(K^×) are already in this subgroup, but the
members of 𝕀(S) need not be. Thus we replace 𝕀(S) by 𝕀(S) ∩ 𝕀^1 = 𝕀^1(S), and
the above equality becomes

𝕀^1 = ι(K^×) 𝕀^1(S)

for a suitable finite set S.


PROOF OF COMPACTNESS OF 𝕀^1/ι(K^×) IN THEOREM 6.53. Let S be as above.
Since 𝕀^1 = ι(K^×) 𝕀^1(S), the Second Isomorphism Theorem gives

𝕀^1/ι(K^×) ≅ 𝕀^1(S)/(ι(K^×) ∩ 𝕀^1(S)).   (∗)

We shall prove that the right side is compact.

Let T be the complement of S_∞ in S, and define

Ω_1^× = ∏_{v∈S_∞} K_v^×,   Ω_2^× = ∏_{v∈T} K_v^×,   Λ_2^× = ∏_{v∈T} R_v^×,   Λ_3^× = ∏_{v∉S} R_v^×.

If E is any subset of 𝕀(S), E^1 will denote the set of members of E of total
absolute value 1. Thus for example, (Ω_1^×)^1 is the set of tuples (x_v)_{v∈S_∞} with
∏_{v∈S_∞} |x_v|_v = 1.

Let Φ : K^× → Ω_1^× be the mapping given in Section 9. Each member u of the
group of units R^× has the property that |u|_v = 1 for every nonarchimedean place
v. Then it follows from the Artin product formula (Theorem 6.51) that Φ carries
R^× into (Ω_1^×)^1. One of the two key ingredients in the proof of Theorem 6.53 is
the observation that

(Ω_1^×)^1/Φ(R^×) is compact.   (∗∗)


In fact, Ω_1^× is a product of r_1 copies of ℝ^× and r_2 copies of ℂ^×. The function
Log : Ω_1^× → ℝ^{r_1+r_2} given by

Log(x_1, ..., x_{r_1}, x_{r_1+1}, ..., x_{r_1+r_2})
    = (log |x_1|_ℝ, ..., log |x_{r_1}|_ℝ, log |x_{r_1+1}|_ℂ, ..., log |x_{r_1+r_2}|_ℂ)

is a continuous homomorphism of Ω_1^× onto ℝ^{r_1+r_2}, and its kernel is compact,
being the product of r_1 two-element groups and r_2 circles. The image of (Ω_1^×)^1 is
a hyperplane, and the proof of the Dirichlet Unit Theorem (Theorem 5.13) shows
that Log((Ω_1^×)^1)/Log Φ(R^×) is compact. Then (∗∗) follows.
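A small numerical sketch may help fix ideas. It assumes the example K = Q(√2), which is ours and not the text's, so that r_1 = 2, r_2 = 0, and the hyperplane is the line x_1 + x_2 = 0 in the plane. The units of Z[√2] are ±(1 + √2)^n, and the code below (function names are ours) checks in floating point that Log Φ of the first few powers of 1 + √2 lies on that line and is an integer multiple of a single vector.

```python
import math

SQRT2 = math.sqrt(2.0)

def log_phi(a: int, b: int) -> tuple[float, float]:
    """Log(Phi(a + b*sqrt(2))) for the two real embeddings sqrt(2) -> +sqrt(2), -sqrt(2)."""
    return (math.log(abs(a + b * SQRT2)), math.log(abs(a - b * SQRT2)))

def times_unit(a: int, b: int) -> tuple[int, int]:
    """Multiply a + b*sqrt(2) by the unit 1 + sqrt(2)."""
    return (a + 2 * b, a + b)

a, b = 1, 1  # the unit 1 + sqrt(2), whose norm is -1
for n in range(1, 5):
    x1, x2 = log_phi(a, b)
    # x1 + x2 is 0 up to rounding, and x1 is n times log(1 + sqrt(2)).
    print(n, (a, b), round(x1, 6), abs(x1 + x2) < 1e-9)
    a, b = times_unit(a, b)
```

In this example the image of the units is a lattice of rank 1 in the line, so the quotient of the line by it is a circle, which is compact; only small powers are used because a − b√2 loses floating-point accuracy as the powers grow.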

The other key ingredient is the finiteness of the class number of K, which was
proved as Theorem 5.19. Let h be this class number. For each v in T, let
P_v be the corresponding nonzero prime ideal in R. The ideal P_v^h in R is principal,
and we let π_v be a generator. This element has the properties that K_v^×/(ι_v(π_v)^Z R_v^×)
is compact and that |ι_{v'}(π_v)|_{v'} = |π_v|_{v'} = 1 for all nonarchimedean v' with
v' ≠ v. Let

Σ_2 = ∏_{v∈T} ι_v(π_v)^Z R_v^×;

this is a subgroup between Λ_2^× and Ω_2^× such that Ω_2^×/Σ_2 is compact. Let Π be the
subgroup of K^× given by Π = ∏_{v∈T} π_v^Z.

The group ι(Π) is certainly a subgroup of ι(K^×), and the fact that |π_v|_{v'} = 1
for v' ∉ S implies that ι(Π) is contained in 𝕀^1(S). Each member of ι(R^×) has
all nonarchimedean absolute values equal to 1, and consequently we have an
inclusion ι(R^×) ι(Π) ⊆ ι(K^×) ∩ 𝕀^1(S). In view of (∗), 𝕀^1(S)/(ι(K^×) ∩ 𝕀^1(S)) is a
homomorphic image of

𝕀^1(S)/(ι(R^×) ι(Π) ({1} × Λ_2^× × Λ_3^×)),   (†)

and it is therefore enough to prove that (†) is compact.

The members of ι(R^×) have all nonarchimedean absolute values equal to 1,
and consequently

ι(R^×) ({1} × Λ_2^× × Λ_3^×) = Φ(R^×) × Λ_2^× × Λ_3^×.

Therefore the quotient of (†) by

𝕀^1(S)/(ι(Π) ((Ω_1^×)^1 × Λ_2^× × Λ_3^×))   (††)

is isomorphic to

[𝕀^1(S)/(ι(Π) (Φ(R^×) × Λ_2^× × Λ_3^×))] / [𝕀^1(S)/(ι(Π) ((Ω_1^×)^1 × Λ_2^× × Λ_3^×))],

which in turn is isomorphic to

(ι(Π) ((Ω_1^×)^1 × Λ_2^× × Λ_3^×)) / (ι(Π) (Φ(R^×) × Λ_2^× × Λ_3^×)),

which is a homomorphic image of

((Ω_1^×)^1 × Λ_2^× × Λ_3^×) / (Φ(R^×) × Λ_2^× × Λ_3^×) ≅ (Ω_1^×)^1/Φ(R^×).

The right side is compact by (∗∗), and therefore it is enough to prove that (††) is
compact.
Let us check that

ι(Π) ((Ω_1^×)^1 × Λ_2^× × Λ_3^×) = (Ω_1^× × Σ_2 × Λ_3^×)^1.   (‡)

The inclusion ⊆ is immediate. Thus suppose that ((ω_v)_{v∈S_∞}, (σ_v)_{v∈T}, (λ_v)_{v∉S})
lies in the right side of (‡). Since (σ_v)_{v∈T} lies in Σ_2, there exists an
element π_0 in Π such that r_v = ι_v(π_0)^{-1} σ_v lies in R_v^× for all v ∈ T.
Define (ω'_v)_{v∈S_∞} in Ω_1^× by ω'_v = ι_v(π_0)^{-1} ω_v. For a suitable (λ'_v)_{v∉S}, we then
have ι(π_0) ((ω'_v)_{v∈S_∞}, (r_v)_{v∈T}, (λ'_v)_{v∉S}) = ((ω_v)_{v∈S_∞}, (σ_v)_{v∈T}, (λ_v)_{v∉S}), and (‡)
is proved.

Combining (‡) and (††), we see that it is enough to prove that

𝕀^1(S)/(Ω_1^× × Σ_2 × Λ_3^×)^1   (‡‡)

is compact. The inclusion of 𝕀^1(S) into 𝕀(S) induces a homomorphism

𝕀^1(S)/(Ω_1^× × Σ_2 × Λ_3^×)^1 → 𝕀(S)/(Ω_1^× × Σ_2 × Λ_3^×)   (§)

that is evidently one-one. But it is also onto because if v_0 is an archimedean
place and if (x_v)_v is given in 𝕀(S), then we can adjust x_{v_0} in such a way that
the replacement (x_v)_v has absolute value 1. The adjustment is by a member of
Ω_1^× × {1} × {1}, and thus (§) is onto. The right side of (§) is

(Ω_1^× × Ω_2^× × Λ_3^×)/(Ω_1^× × Σ_2 × Λ_3^×) ≅ Ω_2^×/Σ_2,

and we have arranged that this is compact. Consequently (‡‡) is compact, and
the proof is complete.


11. Problems


1. If F is a complete field with a nonarchimedean absolute value and if ∑_{n=1}^∞ a_n is
an infinite series whose terms a_n are in F, prove that the series converges in F if
and only if lim_n a_n = 0.
2. Let the 2-adic absolute value be imposed on Q. Theorem 6.5 shows that Z is 
dense in the subring of Q consisting of all rationals with odd denominator. 
(a) Find a sequence of integers converging in this metric to ; 
(b) Generalize the result of (a) by finding an explicit sequence of integers
converging in this metric to any given rational ab^{-1}, where a and b are
nonzero integers with b odd.


3. For the Dedekind domain R = Z and its field of fractions K = Q, the group of
units R^× is just {±1}, and the set of archimedean places is just S_∞ = {∞}. The
formula ι(R^×) = ι(K^×) ∩ 𝕀(S_∞) of Section 10 therefore becomes {ι(±1)} =
ι(Q^×) ∩ (ℝ^× × ∏_p Z_p^×).

(a) Verify this formula directly.

(b) Since Z is a principal ideal domain, the theory of Section 10 and the above
remarks show that 𝕀 = ι(Q^×) (ℝ^× × ∏_p Z_p^×). Prove this formula by an
explicit construction whose only allowable choice, in view of (a), is a certain
sign.

4. Let R be the Dedekind domain Z[√−5].

(a) Verify for each choice of sign that the ideals (1 ± √−5, 3) and (1 ± √−5, 2)
are prime and that (1 + √−5, 2) = (1 − √−5, 2).

(b) Find the prime factorizations of the principal ideals (1 + √−5) and (3).

(c) Let P be the prime ideal P = (1 + √−5, 3), and let v_P be the valuation of
R determined by P. Prove that v_P((1 + √−5)/3) = 0.

(d) Lemma 6.3 shows that (1 + √−5)/3 can be written as the quotient of two
members a and b of R with v_P(a) = v_P(b) = 0. Find such a choice of a
and b.


5. Let v be a discrete valuation of a field F, let R_v be the valuation ring, and let
P_v be the valuation ideal. It was observed after Proposition 6.2 that 1 + P_v^n is a
group under multiplication for any n ≥ 1. Prove for n ≥ 1 that the multiplicative
group (1 + P_v^n)/(1 + P_v^{n+1}) is isomorphic to the additive group P_v^n/P_v^{n+1} under
the mapping induced by 1 + x ↦ x + P_v^{n+1}.


6. Derive the finiteness of the class number of a number field K from the
compactness of 𝕀^1/ι(K^×) given as Theorem 6.53.


Problems 7-8 compare the topology on the ideles 𝕀 = 𝕀_K of a number field K with
the topology of the adeles 𝔸 = 𝔸_K. The notation is as in Section 10.

7. For each finite set S of places containing the archimedean places, exhibit the
mappings 𝕀(S) → K_v for v ∈ S and 𝕀(S) → R_v for v ∉ S as continuous, and
deduce that the inclusion 𝕀 → 𝔸 is continuous.


8. Let p_n be the nth positive prime in Z, and let x_n = (x_{n,v})_v be the adele in 𝔸_Q
with x_{n,v} = p_n if v = p_n and x_{n,v} = 1 if v ≠ p_n. The result is a sequence {x_n} of
ideles in 𝕀_Q. Show that this sequence converges to the idele (1)_v in the topology
of the adeles but does not converge in the topology of the ideles.


Problems 9-10 below assume knowledge from measure theory of elementary prop- 
erties of measures and of the existence—uniqueness theorem for translation-invariant 
measures (Haar measures) on locally compact abelian groups. The continuity in 
Problem 10a requires making estimates of integrals. 


9. Let G be a locally compact abelian topological group with a Haar measure
written as dx, and let Φ be an automorphism of G as a topological group, i.e., an
automorphism of the group structure that is also a homeomorphism of G. Prove
that there is a positive constant a(Φ) such that d(Φ(x)) = a(Φ) dx.


10. Let F be a locally compact topological field, and let F^× be the group of nonzero
elements, the group operation being multiplication.

(a) Let c be in F^×, and define |c|_F to be the constant a(Φ) from the previous
problem when the measure is an additive Haar measure and Φ is multiplication
by c. Define |0|_F = 0. Prove that c ↦ |c|_F is a continuous function
from F into [0, +∞) such that |c_1 c_2|_F = |c_1|_F |c_2|_F.

(b) If dx is a Haar measure for F as an additive locally compact group, prove
that dx/|x|_F is a Haar measure for F^× as a multiplicative locally compact
group.

(c) Let F = ℝ be the locally compact field of real numbers. Compute the
function x ↦ |x|_F. Do the same thing for the locally compact field F = ℂ
of complex numbers.

(d) Let F = Q_p be the locally compact field of p-adic numbers, where p is a
prime. Compute the function x ↦ |x|_F.

(e) For the field F = Q_p of p-adic numbers, suppose that the ring Z_p of p-adic
integers has additive Haar measure 1. What is the additive Haar measure of
the maximal ideal I of Z_p?


Problems 11-14 analyze the structure of complete valued fields whose residue class
fields are finite, showing that the only kinds are p-adic fields and fields of formal
Laurent series over a finite field. Let F be a complete valued field with a discrete
nonarchimedean valuation, let v be the valuation, let R be the valuation ring, and let
p be the maximal ideal of R. Suppose that the residue class field R/p is finite of order
q = p^m for a prime number p. Theorem 6.26 shows that the topology on F is locally
compact. The normalized absolute value on F corresponding to v is |·|_F = q^{-v(·)}.
For some purposes it is convenient to separate the equal-characteristic case for F
and R/p from the unequal-characteristic case.

11. Show in the unequal-characteristic case that F has characteristic 0.


12. (a) In both cases, use Hensel's Lemma to show that F has a full set of (q − 1)st
roots of unity and that coset representatives in F for R/p can be taken to
be these elements and 0. Denote this subset of q elements of F by E. The
subset E is of course closed under multiplication.

(b) Show in the equal-characteristic case that E is closed under addition and
subtraction and is therefore a subfield of F isomorphic to 𝔽_q.


13. In the equal-characteristic case, write 𝔽_q for the subfield of F constructed in
Problem 12b, and let t be a generator of the principal ideal p, so that v(t) = 1.

(a) Show that each nonzero element of R has a convergent infinite-series
expansion of the form ∑_{k=0}^∞ a_k t^k with all a_k in 𝔽_q and that the value of v on
such an element is the smallest k ≥ 0 such that a_k ≠ 0.

(b) Show conversely that every series ∑_{k=0}^∞ a_k t^k with all a_k in 𝔽_q lies in R, and
conclude that R ≅ 𝔽_q[[t]].

(c) Deduce that F is isomorphic to the field 𝔽_q((t)) of formal Laurent series
over 𝔽_q, the understanding being that each such series involves only finitely
many negative powers of t.


14. Let F be an arbitrary complete valued field in the unequal-characteristic case.
Since Problem 11 shows F to be of characteristic 0, F contains a subfield Q'
isomorphic as a field to Q.

(a) Show that the integer q = p^m in Q' lies in p.

(b) Deduce that the number v_0 = v(p) is positive.

(c) For each nonzero member ab^{-1}p^k of Q' for which a and b are integers
relatively prime to p, show that v(ab^{-1}p^k) = kv_0.

(d) Deduce that (Q', |·|_F^{1/(mv_0)}) is isomorphic as a valued field to (Q, |·|_p).

(e) Let Q̄' be the closure of Q' in F, and explain why (Q̄', |·|_F^{1/(mv_0)}) is isomorphic
as a valued field to (Q_p, |·|_p).

(f) Let t be a generator of p. With E as in Problem 12a, show that each member
of F has a unique series expansion ∑_{k=-N}^∞ a_k t^k with each a_k in E and with
N depending on the element, and show furthermore that every such series
expansion converges to an element of F.

(g) Let c_1, ..., c_l with l = q^{v_0} be an enumeration of the elements ∑_{k=0}^{v_0−1} a_k t^k
with all a_k in E. Show that to each element x in R corresponds some c_j such
that p^{-1}(x − c_j) lies in R. Deduce that every element of R is the sum of a
convergent series of the form ∑_{k=0}^∞ c_{j_k} p^k.

(h) Explain how it follows from the previous part that F is a finite-dimensional
vector space over Q̄', hence that F is a finite extension of the field Q_p.


Problems 15-19 continue the analysis in Problems 11-14 by examining finite
separable extensions of complete valued fields whose residue class fields are finite. The
goal is to prove Proposition 6.38 and Lemmas 6.47 and 6.48. Let F be a complete
valued field with a discrete nonarchimedean valuation, let R be the valuation ring, and
let p be the maximal ideal of R. Suppose that the residue class field R/p is finite of
order q = p^m for a prime number p. Let K be a finite separable extension of F, put
n = [K : F], and let T be the integral closure of R in K. Theorem 6.33 shows that
K is a valued field, that it has a unique nonzero prime ideal P, that the valuation ring
of K is T, and that the valuation ideal is P. Write f for the dimension of T/P over
R/p, so that T/P has order q^f. Also, write e for the power such that pT = P^e. It
is known from Chapter IX of Basic Algebra that n = ef. In the equal-characteristic
case, there is an especially transparent argument for proving Proposition 6.38, and
Problem 15 gives that. Problem 16 gives a less transparent argument that handles
both cases at once. The remaining problems address Lemmas 6.47 and 6.48.


15. In the equal-characteristic case, let E be the subset of q elements of F described
in Problem 12, and let E' be the corresponding subset of q^f elements of K.
Problem 13 shows that E is a field isomorphic to 𝔽_q and that E' is an extension
field isomorphic to 𝔽_{q^f}. Let t be a generator in R of p, and let t' be a generator
in T of P. Problem 13 shows that F ≅ 𝔽_q((t)) and that K ≅ 𝔽_{q^f}((t')).

(a) Show that the set L of formal Laurent series in t with coefficients from 𝔽_{q^f}
is an intermediate field between F and K, so that L ≅ 𝔽_{q^f}((t)).

(b) Why does it follow that the integral closure of R in L is U = 𝔽_{q^f}[[t]] and
that the maximal ideal of U is ℘ = tU?

(c) Deduce that the residue class field of L is 𝔽_{q^f} of order q^f and that ℘T = P^e,
so that the residue class degree of L/F is f and the ramification index of
K/L is e.

(d) How can one conclude that L/F is unramified and that K/L is totally
ramified?


16. In this problem no distinction is made between the equal-characteristic case and
the unequal-characteristic case. Let k_F and k_K be the residue class fields of F and
K, and write k_K = k_F(ᾱ), where ᾱ is a root of a monic irreducible polynomial
ḡ(X) in k_F[X]. Let g(X) be a monic polynomial in R[X] that reduces modulo
p to ḡ(X).

(a) Prove that there exists α ∈ T with α + P = ᾱ and with g(α) = 0.

(b) With α as in (a), let L be the intermediate field between F and K given by
L = F(α), let U be the integral closure of R in L, let ℘ be the maximal
ideal of U, and let k_L = U/℘. Show that α lies in U and that the member
ᾱ of k_K is in the image of the natural field map k_L → k_K.

(c) Conclude from (b) that k_L = k_K.

(d) By comparing [L : F], the degrees of g(X) and ḡ(X), and the indices e and
f for K/F and L/F, prove that L has the properties required by Proposition
6.38.


17. This problem applies to both the equal-characteristic case and the
unequal-characteristic case. Let ξ be a member of T such that K = F(ξ), and let
g(X) = X^n + c_1 X^{n−1} + ⋯ + c_n be its minimal polynomial over F.
Let N = ∑_{k=0}^{n−1} Rξ^k. This is a free R submodule of T of rank n with
{1, ξ, ..., ξ^{n−1}} as an R basis. Define

N̂ = {y ∈ K | Tr_{K/F}(xy) is in R for all x ∈ N}.

(a) Put x_i = ξ^{i−1} for 1 ≤ i ≤ n. Why is there a unique y_j in K with
Tr_{K/F}(x_i y_j) = δ_{ij}? Show that N̂ is a free R module with {y_1, ..., y_n}
as R basis.

(b) If A is a matrix in M_n(R) with det A = ±1 and if z_i = ∑_{j=1}^n A_{ij} y_j, why is
∑_{k=1}^n Rz_k = ∑_{k=1}^n Ry_k?

(c) Let K' be a splitting field of g(X) over F, and let ξ_1, ..., ξ_n be the roots of
g(X) in K', with ξ_1 = ξ. It is known from Basic Algebra that ξ_1, ..., ξ_n are
distinct. Prove that

∑_{j=1}^n g(X) / (g'(ξ_j)(X − ξ_j)) = 1

by observing that the difference of the two sides is a polynomial in X of
degree at most n − 1 and all of ξ_1, ..., ξ_n are roots.

(d) Let σ_j be the field map that fixes F and carries F(ξ) into K' in such a way that
σ_j(ξ) = ξ_j. These mappings have the property that Tr_{K/F}(η) = ∑_{j=1}^n σ_j(η)
for all η ∈ K. If h(X) is in the ring K[[X]] of formal power series over K,
let h^{σ_j}(X) be the power series obtained by applying σ_j to each coefficient,
and extend Tr_{K/F} : K → F to a mapping of K[[X]] to F[[X]] by letting
Tr_{K/F} h(X) = ∑_{j=1}^n h^{σ_j}(X). By making the substitution X ↦ 1/X in (c)
and using the extended trace function just defined, show that

X^n / (1 + c_1 X + ⋯ + c_n X^n) = Tr_{K/F}( X / (g'(ξ)(1 − ξX)) ).

(e) Write the identity in (d) out with power series, equate the coefficients
of X, X^2, ..., X^n on the two sides, and deduce that Tr_{K/F}(ξ^{k−1} g'(ξ)^{-1})
equals 0 for 1 ≤ k < n and equals 1 for k = n.

(f) Form the n-by-n matrix A with A_{ij} = Tr_{K/F}((ξ^{i−1} g'(ξ)^{-1})(ξ^{j−1})). The
result of (e) shows that this matrix has all entries equal to 0 that lie above
the off-diagonal i + j = n + 1 and all entries equal to 1 that lie on the
off-diagonal. By writing ξ^{i+j−2} = ξ^n ξ^{i+j−(n+1)−1} and by substituting for
ξ^n, show that the remaining entries A_{ij} lie in R.

(g) Combine the conclusions of (a), (b), and (f) to prove that N̂ = g'(ξ)^{-1} N.


18. This problem continues with the notation of Problem 17 and assumes in addition
that K/F is unramified, i.e., that f = n and e = 1. The objective is to prove the
assertion of Lemma 6.48 that D(K/F) = T.

(a) Prove that the intermediate field L constructed in Problem 16 is K itself,
that the polynomial g(X) is the minimal polynomial of α over F, and that
K = F(α).

(b) Let N = ∑_{k=0}^{n−1} Rα^k. Apply Problem 17 to obtain N̂ = g'(α)^{-1} N. Using
the inclusion N ⊆ T, deduce that N̂ ⊇ T̂, and conclude that D(K/F)^{-1} ⊆
g'(α)^{-1} T.

(c) Prove that g'(α) is a unit in T, and deduce that D(K/F) = T.


19. This problem continues with the notation of Problem 17 and assumes in addition
that K/F is totally ramified, i.e., that e = n and f = 1. The objective is to prove
the assertion of Lemma 6.47 that D(K/F) = P^{e'} with e' equal to e − 1 if p does
not divide e and with e' ≥ e if p divides e. Let E be the set of representatives in
R of the members of R/p as constructed in Problem 12. Since f = 1, the set E
is also a set of representatives in T of the members of T/P. Let v_K and v_F be the
respective discrete valuations of K and F, so that v_K|_F = n v_F by Proposition
6.34. Let π and λ be respective generators of P and p.

(a) Prove that if M is a field with a discrete valuation w and if x_1, ..., x_m are
elements of M with x_1 + ⋯ + x_m = 0 and m ≥ 2, then the number of j's
for which w(x_j) = min_{1≤i≤m} w(x_i) is at least 2.

(b) Let g(X) = c_0 X^n + c_1 X^{n−1} + ⋯ + c_n with c_0 = 1 be the field polynomial
of π over F. Why are all the coefficients c_j in R, and why is v_K(c_j) divisible
by n for each j?

(c) Taking into account that π is a root of its field polynomial and applying
(a), show that there exist integers i and j with 0 ≤ i < j ≤ n such that
j − i = v_K(c_j) − v_K(c_i) and that all other integers k with 0 < k < n have
v_K(c_k) ≥ 1.

(d) Using the divisibility conclusion of (b), show that g(X) is an Eisenstein
polynomial relative to p in the sense that c_0 = 1, that all of c_1, ..., c_n lie in
p, and that c_n does not lie in p^2.

(e) Conclude from (d) that g(X) is irreducible over F, that g(X) is the minimal
polynomial of π over F, and that K = F(π).

(f) For each k ≥ 0, apply the division algorithm to write k = ni + j with
0 ≤ j < n = e, and define y_k = λ^i π^j. Show that every member of T has
a unique convergent series expansion as ∑_{k=0}^∞ a_k y_k with each a_k in E and
that all such series expansions have sum in T.

(g) By rewriting the expansion in (f) suitably, show that {1, π, ..., π^{n−1}} is an
R basis for the free R module T.

(h) By applying Problem 17 with N = ∑_{k=0}^{n−1} Rπ^k, prove that T̂ = g'(π)^{-1} T,
and deduce that D(K/F) = (g'(π)).

(i) Computing g'(π) and applying the valuation v_K to it, show that v_K(g'(π)) =
e − 1 if v_K(e) = 0 and that v_K(g'(π)) ≥ e if v_K(e) > 0. Explain how this
conclusion proves Lemma 6.47.


CHAPTER VII 


Infinite Field Extensions 


Abstract. This chapter provides algebraic background for directly addressing some simple-sounding 
yet fundamental questions in algebraic geometry. All the questions relate to the set of simultaneous 
zeros of finitely many polynomials in n variables over a field. 

Section 1 concerns existence of zeros. The main theorem is the Nullstellensatz, which in part
says that there is always a zero if the finitely many polynomials generate a proper ideal and if the
underlying field is algebraically closed.

Section 2 introduces the transcendence degree of a field extension. If L/K is a field extension, 
a subset of L is algebraically independent over K if no nonzero polynomial in finitely many of 
the members of the subset vanishes. A transcendence basis is a maximal subset of algebraically 
independent elements; a transcendence basis exists, and its cardinality is independent of the particular 
basis in question. This cardinality is the transcendence degree of the extension. Then L is algebraic 
over the subfield generated by a transcendence basis. Briefly any field extension can be obtained by 
a purely transcendental extension followed by an algebraic extension. The dimension of the set of 
common zeros of a prime ideal of polynomials over an algebraically closed field is defined to be the 
transcendence degree of the field of fractions of the quotient of the polynomial ring by the ideal. 

Section 3 elaborates on the notion of separability of field extensions in characteristic p. Every
algebraic extension L/K can be obtained by a separable extension followed by an extension that is
purely inseparable in the sense that every element x of L has a power x^{p^e} for some integer e ≥ 0
with x^{p^e} separable over K.

Section 4 introduces the Krull dimension of a commutative ring with identity. This number is
one less than the maximum number of ideals occurring in a strictly increasing chain of prime ideals
in the ring. For K[X_1, ..., X_n] when K is a field, the Krull dimension is n. If P is a prime ideal in
K[X_1, ..., X_n], then the Krull dimension of the integral domain R = K[X_1, ..., X_n]/P matches
the transcendence degree over K of the field of fractions of R. Thus Krull dimension extends the
notion of dimension that was defined in Section 2.

Section 5 concerns nonsingular and singular points of the set of common zeros of a prime ideal 
of polynomials in n variables over an algebraically closed field. According to Zariski’s Theorem, 
nonsingularity of a point may be defined in either of two equivalent ways: in terms of the rank of a
Jacobian matrix obtained from generators of the ideal, or in terms of the dimension of the quotient of 
the maximal ideal at the point in question factored by the square of this ideal. The point is nonsingular 
if the rank of the Jacobian matrix is n minus the dimension of the zero locus, or equivalently if the 
dimension of the quotient of the maximal ideal by its square equals the dimension of the zero locus. 
Nonsingular points always exist. 

Section 6 extends Galois theory to certain infinite field extensions. In the algebraic case inverse 
limit topologies are imposed on Galois groups, and the generalization of the Fundamental Theorem 
of Galois Theory to an arbitrary separable normal extension L/K gives a one-one correspondence 
between the fields F with K ⊆ F ⊆ L and the closed subgroups of Gal(L/K).


1. Nullstellensatz


Algebraic geometry studies the geometric properties of sets defined by algebraic 
equations. In the simplest case some field K is specified, the equations are 
polynomial equations in several variables with coefficients in K , and one seeks 
solutions to the system of equations with the variables taking values in K or some 
larger field. 

The nature of the subject is that even fairly simple-sounding geometric ques- 
tions require algebraic background beyond what is in Basic Algebra and the 
first six chapters of the present book. This chapter addresses the necessary 
background, largely from the theory of fields, for addressing fundamental ques- 
tions concerning existence of solutions, the dimension of the space of solutions, 
singularity of the solution set at a particular point, and effects of changing fields. 

The present section supplies background for the question of existence. We
have a system of polynomial equations in n variables with coefficients in K, and
we are interested in simultaneous solutions in a given extension field L of K. A
solution can be regarded as a column vector in L^n. Think of the equations as of the
form F_i(X_1, ..., X_n) = 0 with each F_i a polynomial, and then the set of solutions
is the locus of common zeros of the F_i's in L^n. The locus of common zeros is
unaffected by enlarging the system of equations by allowing all equations of the
form ∑_i G_i F_i = 0 with each G_i arbitrary in K[X_1, ..., X_n]; thus we may as
well regard the left sides as all members of some ideal I in K[X_1, ..., X_n]. The
Hilbert Basis Theorem says that any ideal in K[X_1, ..., X_n] is finitely generated,
and hence studying the common zero locus for an ideal is always the same as
studying the common zero locus for a finite set of polynomials.

A proper ideal need not have a nonempty locus of common zeros. For example,
if K = ℝ, then the single equation X^2 + Y^2 + 1 = 0 has no solutions in ℝ^2.
Hilbert's Nullstellensatz¹ is partly the affirmative statement that any proper ideal
has a nonempty locus of common zeros under the additional assumption that K is
algebraically closed.
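The contrast between the real field and its algebraic closure in this example can be checked directly. The sketch below is only an illustration with names of our choosing: it evaluates X^2 + Y^2 + 1 on a finite grid of real points, which can suggest but not prove the absence of real zeros (the proof is that squares of real numbers are nonnegative), and then at two complex points where the polynomial vanishes exactly.

```python
def f(x, y):
    """The polynomial X^2 + Y^2 + 1 evaluated at the point (x, y)."""
    return x * x + y * y + 1

# Over the reals every value is at least 1; the grid below merely samples this.
grid = [k / 4 for k in range(-20, 21)]
smallest = min(f(x, y) for x in grid for y in grid)
print(smallest)        # 1.0, attained at (0, 0)

# Over the complex numbers the locus of zeros is nonempty.
print(f(1j, 0) == 0)   # True
print(f(0, -1j) == 0)  # True
```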


Theorem 7.1 (Nullstellensatz). Let K be a field, let K̄ be an algebraic closure,
and let n be a positive integer. Then every maximal ideal I of K[X_1, ..., X_n]
has the property that K[X_1, ..., X_n]/I is a finite algebraic extension of K, and
in particular the maximal ideals of K̄[X_1, ..., X_n] are of the form

(X_1 − a_1, ..., X_n − a_n),

where (a_1, ..., a_n) is an arbitrary member of K̄^n. Consequently if I is any proper
ideal in K[X_1, ..., X_n], then

(a) the locus of common zeros of I in K̄^n is nonempty,
(b) any f in K[X_1, ..., X_n] that vanishes on the locus of common zeros of
I in K̄^n has the property that f^k is in I for some integer k > 0.


¹German for "zero-locus theorem."


Before coming to the proof, we mention an important corollary. 


Corollary 7.2. Let K be a field, let K̄ be an algebraic closure, let n be a
positive integer, and let I be a prime ideal in K[X_1, ..., X_n]. Then I contains
every polynomial in K[X_1, ..., X_n] that vanishes on the locus of common zeros
of I in K̄^n.


PROOF. If f is a member of K[X_1, ..., X_n] that vanishes on the locus of
common zeros of I, then (b) in the theorem shows that f^k is in I for some k.
Since I is prime, one of the factors of f^k = f ⋯ f lies in I.


EXAMPLE FOR COROLLARY. Let $K = \bar K = \mathbb{C}$, and let $I$ be the principal ideal in
$\mathbb{C}[X,Y]$ generated by $Y^2 - X(X+1)(X-1)$. Consider $\mathbb{C}[X,Y]$ as isomorphic
to $\mathbb{C}[X][Y]$. As a polynomial in $Y$ over $\mathbb{C}[X]$, $p(X,Y) = Y^2 - X(X+1)(X-1)$
is irreducible because $X(X+1)(X-1)$ is not the square of a polynomial in $X$.
Since $\mathbb{C}[X,Y]$ is a unique factorization domain, $p(X,Y)$ is prime. Therefore $I =
(p(X,Y))$ is a prime ideal. The corollary says that every polynomial vanishing
on the locus of points $(x,y) \in \mathbb{C}^2$ for which $y^2 = x(x+1)(x-1)$ is the product
of $Y^2 - X(X+1)(X-1)$ and a polynomial in $(X,Y)$. Consequently the ring
of restrictions of polynomials to the locus for which $y^2 = x(x+1)(x-1)$ is
isomorphic to $\mathbb{C}[X,Y]/(Y^2 - X(X+1)(X-1))$.


Theorem 7.1b has a tidy formulation in terms of the “radical” of an ideal. If
$R$ is a commutative ring with identity and $I$ is an ideal in $R$, then the radical of
$I$, denoted by $\sqrt{I}$, is the set of all $r$ in $R$ such that $r^k$ is in $I$ for some $k \ge 1$. It is
immediate that the radical of $I$ is an ideal containing $I$ and that $\sqrt{I}$ is proper if $I$
is proper. If $I$ is an ideal in $K[X_1,\dots,X_n]$ and if $f$ is in $\sqrt{I}$, then $f^k$ is in $I$ for
some $k > 0$, and hence $f$ vanishes on the locus of common zeros of $I$. Theorem
7.1b says conversely that any $f$ vanishing on the locus of common zeros of $I$ has
$f^k$ in $I$ for some $k > 0$. This means that $f$ is in $\sqrt{I}$. We can therefore rewrite
(b) in the theorem as follows:


(b′) the ideal of all $f$ in $K[X_1,\dots,X_n]$ that vanish on the locus of common
zeros of $I$ in $\bar K^n$ is exactly $\sqrt{I}$.
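As a small illustration of the radical and of (b′), here is a one-variable sketch. The ideal below is a hypothetical example of ours, chosen only because its zero locus can be read off directly; $K$ is assumed algebraically closed.

```latex
% Assumed example (not from the text): I = (X^2 (X - 1)) in K[X].
\[
  I = \bigl(X^2(X-1)\bigr) \subseteq K[X], \qquad
  \text{locus of common zeros of } I = \{0,\, 1\}.
\]
% A polynomial vanishes at both 0 and 1 exactly when X(X-1) divides it, so
\[
  \sqrt{I} = \bigl(X(X-1)\bigr), \qquad
  \bigl(X(X-1)\bigr)^2 = X^2(X-1)\cdot(X-1) \in I .
\]
% Here the exponent k = 2 already carries the generator of \sqrt{I} into I.
```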


The proof of Theorem 7.1 will follow comparatively easily from the following 
two lemmas. 


Lemma 7.3. If $K$ is a field and $L$ is an extension field that is generated as a
$K$ algebra by $n$ elements $x_1,\dots,x_n$, i.e., if $L = K[x_1,\dots,x_n]$, then every $x_j$ is
algebraic over $K$.


REMARKS. Conversely if $x_1,\dots,x_n$ are elements of an extension field $L$ that
are algebraic over $K$, then $K(x_1,\dots,x_n) = K[x_1,\dots,x_n]$. The reason is that

$$\begin{aligned}
K(x_1,\dots,x_n) &= K(x_1,\dots,x_{n-1})(x_n) = K(x_1,\dots,x_{n-1})[x_n] \\
&= K(x_1,\dots,x_{n-2})(x_{n-1})[x_n] = K(x_1,\dots,x_{n-2})[x_{n-1}][x_n] \\
&= \cdots = K[x_1]\cdots[x_{n-1}][x_n] = K[x_1,\dots,x_n].
\end{aligned}$$
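The first step of this chain, that adjoining one algebraic element as a ring already produces a field, can be checked by hand in the simplest case. The choice $K = \mathbb{Q}$ and $x_1 = \sqrt{2}$ below is ours and serves only as a sketch.

```latex
% Assumed example: K = \mathbb{Q}, x_1 = \sqrt{2}, a root of X^2 - 2.
% For rational a, b not both 0, the denominator a^2 - 2b^2 is nonzero
% because \sqrt{2} is irrational.
\[
  \frac{1}{a + b\sqrt{2}}
  = \frac{a - b\sqrt{2}}{(a + b\sqrt{2})(a - b\sqrt{2})}
  = \frac{a}{a^2 - 2b^2} - \frac{b}{a^2 - 2b^2}\,\sqrt{2}
  \;\in\; \mathbb{Q}[\sqrt{2}] .
\]
% Hence every nonzero element of \mathbb{Q}[\sqrt{2}] is invertible there,
% i.e., \mathbb{Q}(\sqrt{2}) = \mathbb{Q}[\sqrt{2}].
```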


PROOF. We proceed by induction on $n$. For $n = 1$, if $L = K[x_1]$, then we
know from the elementary theory of fields that $x_1$ is algebraic over $K$.


For the inductive step, suppose that $L = K[x_1,\dots,x_n]$. Since $L$ is a field,
$K(x_1) \subseteq L$, and hence $L = K(x_1)[x_2,\dots,x_n]$. By the inductive hypothesis
applied to $L$ and $K(x_1)$, the elements $x_2,\dots,x_n$ are algebraic over $K(x_1)$. To
complete the proof, it is enough to show that $x_1$ is algebraic over $K$.

Fix $j \ge 2$. The element $x_j$, being algebraic over $K(x_1)$, satisfies a polynomial
equation

$$X^m + a_{m-1}X^{m-1} + \cdots + a_0 = 0$$

with $a_{m-1},\dots,a_0$ in $K(x_1)$. Clearing fractions, we see that $x_j$ satisfies an equation

$$b_m X^m + b_{m-1}X^{m-1} + \cdots + b_0 = 0$$

with $b_m,\dots,b_0$ in $K[x_1]$ and $b_m \ne 0$. Multiplying through by $b_m^{m-1}$ shows that
$x_j$ satisfies

$$(b_m X)^m + b_{m-1}(b_m X)^{m-1} + \cdots + b_0 b_m^{m-1} = 0,$$

and we see that $b_m x_j$ is integral over the ring $K[x_1]$. Let us write $c_j$ for the
element $b_m \in K[x_1]$ that we have just produced for this $j$.

In the case of $j = 1$, we can use $m = 1$ and $b_0 = -x_1$ in the above argument,
and we are then led to $c_1 = 1$. If $x_1^{l_1} \cdots x_n^{l_n}$ is any monomial in $K[x_1,\dots,x_n]$ and
if $l$ is defined as $l = \max(l_1,\dots,l_n)$, then the fact that the integral elements over
$K[x_1]$ form a ring implies that $(c_1 \cdots c_n)^l x_1^{l_1} \cdots x_n^{l_n}$ is integral over $K[x_1]$. Hence
for any $f$ in $K[x_1,\dots,x_n]$, $(c_1 \cdots c_n)^l f$ is integral over $K[x_1]$ for a suitable
integer $l = l(f)$. Since $K(x_1) \subseteq K[x_1,\dots,x_n]$, this conclusion applies in
particular to any member $f$ of $K(x_1)$.

The ring $K[x_1]$ is a principal ideal domain and is therefore integrally closed
in its field of fractions $K(x_1)$. For $f$ in $K(x_1)$, we have seen that $(c_1 \cdots c_n)^l f$
is integral over $K[x_1]$ for some $l = l(f)$. The element $(c_1 \cdots c_n)^l f$ is in $K(x_1)$,
and the integral-closure property therefore implies that $(c_1 \cdots c_n)^l f$ is in $K[x_1]$.

Consequently there exists a fixed element $h$ of $K[x_1]$, namely $h = c_1 \cdots c_n$,
such that every element $f$ of $K(x_1)$ is of the form $g/h^l$ for some $g$ in $K[x_1]$ and
some integer $l \ge 0$. We apply this observation to $f = q(x_1)^{-1}$ for each irreducible
polynomial $q(X)$ in $K[X]$, and we obtain $q(x_1)g = h^l$ with $g$ and $l$ depending on
$q(X)$. If $x_1$ is transcendental over $K$, this equality implies the polynomial
identity $q(X)g(X) = h(X)^l$. Consequently every irreducible polynomial $q(X)$
divides $h(X)$. If $K$ is infinite, this is a contradiction because there are infinitely
many distinct polynomials $X - a$ in $K[X]$; if $K$ is finite, this is a contradiction
because there exists at least one irreducible polynomial of each degree $\ge 1$. We
arrive at a contradiction in either case, and therefore $x_1$ is algebraic over $K$.
This completes the induction and the proof.


Lemma 7.4. Let $K$ be a field, and let $L$ be an algebraic extension of $K$. If
$I$ is a proper ideal in $K[X_1,\dots,X_n]$, then $I\,L[X_1,\dots,X_n]$ is a proper ideal in
$L[X_1,\dots,X_n]$.


REMARK. As usual, the notation $I\,L[X_1,\dots,X_n]$ refers to the set of sums of
products of elements of $I$ and elements of $L[X_1,\dots,X_n]$.


PROOF. First let us identify the integral closure of $K[X_1,\dots,X_n]$ in the field
$L(X_1,\dots,X_n)$ as $L[X_1,\dots,X_n]$. The ring $L[X_1,\dots,X_n]$ is a unique
factorization domain, and Proposition 8.41 of Basic Algebra shows that it is integrally
closed. Consequently the integral closure of $K[X_1,\dots,X_n]$ in $L(X_1,\dots,X_n)$ is
contained in $L[X_1,\dots,X_n]$. On the other hand, the integral closure of
$K[X_1,\dots,X_n]$ in $L(X_1,\dots,X_n)$ contains $L$ because $L/K$ is algebraic, and
it contains each $X_j$. Therefore it contains $L[X_1,\dots,X_n]$ and must equal
$L[X_1,\dots,X_n]$.


Now we apply Proposition 8.53 of Basic Algebra to the ring $K[X_1,\dots,X_n]$,
its field of fractions $K(X_1,\dots,X_n)$, the extension field $L(X_1,\dots,X_n)$, and
the integral closure $L[X_1,\dots,X_n]$ of $K[X_1,\dots,X_n]$ in $L(X_1,\dots,X_n)$. The
proposition says that if $P$ is any maximal ideal of $K[X_1,\dots,X_n]$, then the ideal
$P\,L[X_1,\dots,X_n]$ is proper in $L[X_1,\dots,X_n]$. This result is to be applied to any
maximal ideal $P$ of $K[X_1,\dots,X_n]$ that contains $I$.


PROOF OF THEOREM 7.1. Let $I$ be a maximal ideal in $K[X_1,\dots,X_n]$. Then
$L = K[X_1,\dots,X_n]/I$ is a field. Hence $L = K[x_1,\dots,x_n]$ is a field if the $x_j$’s
are defined by $x_j = X_j + I$. Lemma 7.3 shows that each $x_j$ is algebraic over $K$,
and the first conclusion of the theorem follows.

When this conclusion is applied to $\bar K$ instead of $K$, then the fact that $\bar K$ is
algebraically closed implies that each $x_j$ lies in the cosets determined by $\bar K$, i.e., the
cosets of the constant polynomials. Consequently for each $j$, there is an element
$a_j$ in $\bar K$ such that $X_j - a_j$ lies in $I$. Then it follows that $(X_1 - a_1,\dots,X_n - a_n)$
is contained in $I$. Since the ideal $(X_1 - a_1,\dots,X_n - a_n)$ is maximal, $I =
(X_1 - a_1,\dots,X_n - a_n)$. This proves that the maximal ideals are as in the
displayed expression in the theorem.
To prove (a), we apply Lemma 7.4 to the ideal $I$ in $K[X_1,\dots,X_n]$ and to the
algebraic extension $\bar K$ of $K$. The lemma produces a proper ideal of $\bar K[X_1,\dots,X_n]$
containing $I$, and we extend it to a maximal ideal $J$ of $\bar K[X_1,\dots,X_n]$. From the
previous paragraph of the proof, $J$ is of the form $J = (X_1 - a_1,\dots,X_n - a_n)$ for
some $(a_1,\dots,a_n)$ in $\bar K^n$. The ideal $J$ is therefore identified as the kernel of the
evaluation homomorphism of $\bar K[X_1,\dots,X_n]$ at the point $(a_1,\dots,a_n)$. Every
member of $J$ thus vanishes at $(a_1,\dots,a_n)$, and the same thing is true of every
member of $I$. This proves (a).

For (b), let $I$ be a proper ideal in $K[X_1,\dots,X_n]$, and let $f$ be as in (b).
Introduce an additional indeterminate $Y$, and let $J$ be the ideal in $K[X_1,\dots,X_n,Y]$
generated by $I$ and $fY - 1$. If some point $(x_1,\dots,x_n,y)$ lies on the locus of
common zeros of $J$ in $\bar K^{n+1}$, then $(x_1,\dots,x_n)$ lies on the locus of common zeros
of $I$ in $\bar K^n$, since $I \subseteq J$; thus $f(x_1,\dots,x_n) = 0$, since $f$ is assumed to vanish
on all common zeros of $I$ in $\bar K^n$. Consequently $f(x_1,\dots,x_n)y - 1 = -1 \ne 0$,
and we find that $f(X_1,\dots,X_n)Y - 1$ does not vanish on the locus of common
zeros of $J$ in $\bar K^{n+1}$, contradiction. We conclude that no point $(x_1,\dots,x_n,y)$ lies
on the locus of common zeros of $J$ in $\bar K^{n+1}$. By (a), we see that

$$J = K[X_1,\dots,X_n,Y]. \tag{$*$}$$

Let us write $X$ for the expression $X_1,\dots,X_n$. Then ($*$) implies that

$$1 = \sum_{i=1}^{r} p_i(X,Y)\,g_i(X) + q(X,Y)\bigl(f(X)Y - 1\bigr) \tag{$**$}$$

for some $g_1,\dots,g_r$ in $I$ and some $p_1,\dots,p_r$ and $q$ in $K[X,Y]$. Let $\varphi$ be the
substitution homomorphism of $K[X,Y]$ into $K(X)$ that carries $K$ into itself, $X$
into itself, and $Y$ into $f(X)^{-1}$. Application of $\varphi$ to ($**$) gives

$$1 = \sum_{i=1}^{r} p_i\bigl(X, f(X)^{-1}\bigr)\,g_i(X), \tag{$\dagger$}$$

since $\varphi(f(X)Y - 1) = 0$. If $Y^k$ is the largest power of $Y$ that appears in any of
the polynomials $p_i(X,Y)$, then we can rewrite ($\dagger$) as

$$f(X)^k = \sum_{i=1}^{r} \Bigl(f(X)^k\,p_i\bigl(X, f(X)^{-1}\bigr)\Bigr)\,g_i(X)$$

and exhibit $f(X)^k$ as the sum of products of the members $g_i$ of $I$ by members of
$K[X]$. Thus $f(X)^k$ is in $I$, and (b) is proved.
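The argument for (b) can be traced through by hand in a tiny case. The following sketch takes $n = 1$, $I = (X^2)$, and $f = X$; these choices are ours, and the polynomials $p_1$ and $q$ exhibited are just one possible solution of ($**$), not a canonical one.

```latex
% Assumed data: I = (X^2) in K[X], f = X (which vanishes at the only zero, X = 0).
% J is generated by X^2 and XY - 1. An explicit instance of (**), with r = 1:
\[
  1 \;=\; \underbrace{Y^2}_{p_1(X,Y)} \cdot \underbrace{X^2}_{g_1(X)}
      \;+\; \underbrace{\bigl(-(1 + XY)\bigr)}_{q(X,Y)} \cdot (XY - 1),
\]
% since (1 + XY)(XY - 1) = X^2 Y^2 - 1.
% Substituting Y -> f(X)^{-1} = X^{-1} kills the second term:
\[
  1 \;=\; X^{-2} \cdot X^2 .
\]
% The largest power of Y in p_1 is Y^2, so k = 2; multiplying by f(X)^2 = X^2:
\[
  f(X)^2 \;=\; \bigl(X^2 \cdot X^{-2}\bigr)\, X^2 \;=\; 1 \cdot X^2 \;\in\; I .
\]
```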


2. Transcendence Degree 


Let K be a field, and let L be an extension field. The algebraic construction in 
this section will show that L can be obtained from K in two steps, by a “purely 
transcendental” extension followed by an algebraic extension. The number of
indeterminates in the first step (or the cardinality if the number is infinite) will be
seen to be an invariant of the construction and will be called the “transcendence 
degree” of L/K. 

Before coming to the details, let us mention what transcendence degree will 
mean geometrically. Suppose that the field K is algebraically closed, suppose 
that $I$ is a prime ideal in $K[X_1,\dots,X_n]$, and suppose that $V$ is the locus of
common zeros of $I$. Corollary 7.2 shows that $I$ is the set of all polynomials
vanishing on $V$, and thus the integral domain $K[X_1,\dots,X_n]/I$ may be regarded
as the set of all restrictions to $V$ of polynomials. If $L$ is the field of fractions of
$K[X_1,\dots,X_n]/I$, then the transcendence degree of $L/K$ will be interpreted as
the “number of independent variables” or “dimension” of the locus $V$.

Now we can make the precise definitions. Let K be a field, and let L be 
an extension field. A finite subset $x_1,\dots,x_n$ of $L$ is said to be algebraically
independent over $K$ if the ring homomorphism $K[X_1,\dots,X_n] \to L$ given by
$f \mapsto f(x_1,\dots,x_n)$ is one-one.² Otherwise it is algebraically dependent.


EXAMPLE. Let $K = \mathbb{C}$, and let $p(X,Y) = Y^2 - X(X+1)(X-1)$. The
principal ideal $I = (p(X,Y))$ was shown to be prime in $\mathbb{C}[X,Y]$ in the example
with Corollary 7.2. Therefore $\mathbb{C}[X,Y]/I$ is an integral domain. Let $x$ and $y$ be
the cosets $x = X + I$ and $y = Y + I$. If $L$ denotes the field of fractions of
$\mathbb{C}[X,Y]/I$, then we may regard $x$ and $y$ as members of $L$. The subset $\{x,y\}$ of $L$
is algebraically dependent because the polynomial $p(X,Y)$ maps to 0 under the
substitution homomorphism of $\mathbb{C}[X,Y]$ into $L$ with $X \mapsto x$ and $Y \mapsto y$.


A subset S of L is called a transcendence set over K if each finite subset of 
S is algebraically independent over K. A maximal transcendence set over K is 
called a transcendence basis of L over K. For each transcendence set S of L 
over $K$, we write $K(S)$ for the smallest subfield of $L$ containing $K$ and $S$. If some
transcendence set $S$ has the property that $K(S) = L$, then $L$ is said to be a
purely transcendental extension of $K$; in this case it follows from the definitions
that $S$ is a transcendence basis of $L$ over $K$.


EXAMPLE, CONTINUED. With $K$ and $L$ as in the example above, the sets $S = \{x\}$
and $S = \{y\}$ are transcendence sets over $K = \mathbb{C}$. It is not hard to see that $\{x\}$ is a
transcendence basis of $L$ over $K$. Actually, if $z$ is any member of $L$ that is not in $\mathbb{C}$,
then $\{z\}$ is a transcendence set over $\mathbb{C}$. The reason is that $\mathbb{C}$ is algebraically closed;
hence either $z$ is transcendental over $\mathbb{C}$ or else $z$ lies in $\mathbb{C}$. Lemma 7.6 below shows
that any transcendence set of $L$ over $\mathbb{C}$ can be extended to a transcendence basis,
and Theorem 7.9 shows that all transcendence bases of $L$ over $\mathbb{C}$ have the same
cardinality. It follows that if $z$ is any member of $L$ that is not in $\mathbb{C}$, then $\{z\}$ is a
transcendence basis of $L$ over $\mathbb{C}$ and that every transcendence basis of $L$ over $\mathbb{C}$
is of this form. The two-element set $\{x,y\}$ cannot be a transcendence set by this
reasoning, but we can see this conclusion more directly just by observing that
$\{x,y\}$ was shown in the example above to be algebraically dependent.


²By convention the empty set is algebraically independent over $K$.
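The dependence relation in this example can be written out from either side. The sketch below, which uses an auxiliary indeterminate $T$ of our own choosing, records the explicit polynomial that each of $x$ and $y$ satisfies over the field generated by the other.

```latex
% In L, the cosets satisfy y^2 = x(x+1)(x-1) = x^3 - x.
\[
  y \ \text{is a root of} \ T^2 - (x^3 - x) \in \mathbb{C}(x)[T],
  \qquad
  x \ \text{is a root of} \ T^3 - T - y^2 \in \mathbb{C}(y)[T].
\]
% Since L = \mathbb{C}(x, y), it follows that L is algebraic over \mathbb{C}(x),
% of degree at most 2, and algebraic over \mathbb{C}(y), of degree at most 3.
% Either of {x}, {y} is thus a transcendence set S with L algebraic over \mathbb{C}(S).
```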


Shortly we shall establish the existence of transcendence bases in general. If 
S is a transcendence basis and if K’ is defined to be K (S), then we shall show 
that L is algebraic over K’. The subfield K’ of L depends on the choice of S, but 
there is a uniqueness theorem: the cardinality of a transcendence basis of L/K 
is independent of the particular transcendence basis. 


Lemma 7.5. Let L/K be a field extension, let S be a transcendence set of 
L over K, let K(S) be the subfield of L generated by K and S, and let x be an 
element of L not in S. Then S’ = S U {x} is a transcendence set of L over K if 
and only if x is transcendental over K (S). 


PROOF. Suppose that $x$ is transcendental over $K(S)$ and is not in $S$. Let $n$
distinct elements $x_1,\dots,x_n$ of $S'$ be given. If these are all in $S$, then $f \mapsto
f(x_1,\dots,x_n)$ is one-one because $S$ is a transcendence set. Suppose that one of
the $n$ elements is $x$; say $x_n = x$. If $f$ is in the kernel of the homomorphism
$f \mapsto f(x_1,\dots,x_n)$, i.e., if $f(x_1,\dots,x_n) = 0$, then $x$ is a root of the polynomial
$g(X) = f(x_1,\dots,x_{n-1},X)$ in $K(x_1,\dots,x_{n-1})[X]$. Since $x$ is assumed to
be transcendental over $K(S)$, the polynomial $g$ must be 0. If we expand the
polynomial $f$ in powers of $X$ as

$$f(X_1,\dots,X_{n-1},X) = c_l(X_1,\dots,X_{n-1})X^l + \cdots + c_0(X_1,\dots,X_{n-1}),$$

the condition that $g$ be 0 says that $c_j(x_1,\dots,x_{n-1}) = 0$ for all $j$. Since the set
$\{x_1,\dots,x_{n-1}\}$ is algebraically independent, we see that $c_j = 0$. Therefore $f = 0$.
Hence $\{x_1,\dots,x_n\}$ is algebraically independent, and $S'$ is a transcendence set.

Conversely suppose that $S'$ is a transcendence set of $L$ over $K$. We are to
show that the only polynomial $F(X)$ in $K(S)[X]$ such that $F(x) = 0$ is the 0
polynomial. Since only finitely many coefficients of $F$ are in question, we may
view $F$ as in $K(\{x_1,\dots,x_n\})[X]$ for some finite subset $\{x_1,\dots,x_n\}$ of $S$. Clearing
fractions, we can write $F$ as

$$F(X) = d(x_1,\dots,x_n)^{-1}\bigl(c_l(x_1,\dots,x_n)X^l + \cdots + c_0(x_1,\dots,x_n)\bigr)$$

for suitable polynomials $d, c_0,\dots,c_l$ in $K[X_1,\dots,X_n]$ with $d(x_1,\dots,x_n) \ne 0$.
Define

$$\widetilde{F}(X_1,\dots,X_n,X) = c_l(X_1,\dots,X_n)X^l + \cdots + c_0(X_1,\dots,X_n).$$

The condition $F(x) = 0$ yields $\widetilde{F}(x_1,\dots,x_n,x) = 0$. Since $\{x_1,\dots,x_n,x\}$
is by assumption algebraically independent over $K$, we see that $\widetilde{F} = 0$. Thus
$c_j(X_1,\dots,X_n) = 0$ for all $j$, and consequently $c_j(x_1,\dots,x_n) = 0$ for all $j$.
Therefore $F = 0$, as required.


Lemma 7.6. If L/K is a field extension, then


(a) any transcendence set of L over K can be extended to a transcendence 
basis of L over K, 

(b) any subset of L that generates L as a field over K has a subset that is a 
transcendence basis of L over K. 


In particular, there exists a transcendence basis of L over K. 


PROOF. For (a), order by inclusion upward the transcendence sets containing 
the given one. To apply Zorn’s Lemma, we need only show that the union of a 
chain of transcendence sets in L over K is again a transcendence set. Thus let 
finitely many elements of the union of the sets in the chain be given. Since the sets 
in the chain are nested, all these elements lie in one member of the chain. Hence 
they are algebraically independent over K , and it follows from the definition that 
the union of the sets in the chain is a transcendence set. By Zorn’s Lemma there 
exists a maximal transcendence set, and this is a transcendence basis by definition. 

For (b), we argue in the same way as for (a). Let the given generating set 
be G. Order by inclusion upward the transcendence sets that are subsets of G. 
The empty set is such a transcendence set. As with (a), the union of a chain of 
transcendence sets in L over K is again a transcendence set, and the union is 
contained in G if each individual set is. By Zorn’s Lemma there exists a maximal 
transcendence subset S of G. To complete the proof, it is enough to show that 
every member of G is algebraic over K (S). Let x be in G. We may assume that 
x is not in S. By maximality, S U {x} is not a transcendence set. Then Lemma 
7.5 shows that x is algebraic over K(S). Hence S is the required transcendence 
basis. 

For the final conclusion we apply (a) to the empty set, which is a transcendence 
set of L over K. 


Theorem 7.7. If L/K is a field extension, then there exists an intermediate 
field K’ such that K’/K is purely transcendental and L/K’ is algebraic. 


PROOF. Lemma 7.6 produces a transcendence basis S for L/K. Define K’ 
to be the intermediate field K(S) generated by K and S. Then K’ is purely 
transcendental over $K$ by definition. If $x$ is a member of $L$ that is not in $K'$, then
$S \cup \{x\}$ is not a transcendence set of $L$ over $K$ by maximality of $S$, and Lemma
7.5 shows that $x$ is algebraic over $K(S) = K'$. Hence $L$ is algebraic over $K'$.


As was mentioned earlier in the section, the intermediate field $K'$ with the
properties stated in the theorem is not unique. In the example above with $K = \mathbb{C}$
and with $L$ equal to the field of fractions of $\mathbb{C}[X,Y]/(Y^2 - X(X+1)(X-1))$,
$K'$ can be any subfield $\mathbb{C}(z)$ with $z$ not in the subfield $\mathbb{C}$. For an even simpler
example, let $K$ be arbitrary, and let $L = K(x)$ be any purely transcendental
extension. Use of the transcendence basis $\{x\}$ of $L$ over $K$ leads to $K' = L$ in
the proof of Theorem 7.7. But $\{x^2\}$ is another transcendence basis, and for it we
have $K' = K(x^2)$. The extension $L/K'$ is algebraic because $x$ is a root of the
polynomial $X^2 - x^2$ in $K(x^2)[X]$.

We turn to the matter of showing that any two transcendence bases of L over 
K have the same cardinality. We shall make use of the following result, which 
was proved at the end of the appendix of Basic Algebra: 


Let $S$ and $E$ be nonempty sets with $S$ infinite, and suppose that to
each element $s$ of $S$ is associated a countable subset $E_s$ of $E$ in such
a way that $E = \bigcup_{s \in S} E_s$. Then $\operatorname{card} E \le \operatorname{card} S$.


In our application of this result, the sets $E_s$ will all be finite sets.


Lemma 7.8 (Exchange Lemma). Let $L/K$ be a field extension. If $E$ is any
subset of $L$, let $K(E)$ be the subfield of $L$ generated by $K$ and $E$, and let $\overline{K(E)}$
be the subfield of all elements in $L$ that are algebraic over $K(E)$. If $E \cup \{x\}$ and
$E \cup \{y\}$ are finite transcendence sets of $L$ over $K$ and if $x$ lies in $\overline{K(E \cup \{y\})}$ but
not $\overline{K(E)}$, then $y$ lies in $\overline{K(E \cup \{x\})}$.


PROOF. The condition that $x$ lie in $\overline{K(E \cup \{y\})}$ implies that there exist a finite
subset $\{x_1,\dots,x_n\}$ of $E$ and a member $f$ of $K(X_1,\dots,X_n,Y)[Z]$ such that

$$f(x_1,\dots,x_n,y,Z) \ne 0 \quad\text{but}\quad f(x_1,\dots,x_n,y,x) = 0. \tag{$*$}$$

Clearing fractions, we may assume that $f$ lies in $K[X_1,\dots,X_n,Y,Z]$. Expand
$f$ in powers of $Y$ as

$$f(X_1,\dots,X_n,Y,Z) = \sum_{j=0}^{l} c_j(X_1,\dots,X_n,Z)\,Y^j.$$

Since $f(x_1,\dots,x_n,y,Z) \ne 0$ by ($*$), at least one of the coefficients, say
$c_j$, has to satisfy $c_j(x_1,\dots,x_n,Z) \ne 0$. Lemma 7.5 shows that $x$ is
transcendental over $K(E)$, and therefore $c_j(x_1,\dots,x_n,x) \ne 0$. Consequently
$f(x_1,\dots,x_n,Y,x)$ is nonzero. Since $f(x_1,\dots,x_n,y,x) = 0$ by ($*$), $y$ is
algebraic over $K(\{x_1,\dots,x_n,x\})$. Therefore $y$ lies in $\overline{K(E \cup \{x\})}$.
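The mechanism of the proof is easiest to see in a case with $E$ empty. The following sketch uses hypothetical data of our own, $K = \mathbb{Q}$ and $L = \mathbb{Q}(t)$ with $t$ transcendental, and simply follows the steps of the argument.

```latex
% Assumed data: K = \mathbb{Q}, L = \mathbb{Q}(t), E = \emptyset, y = t, x = t^2 + 1.
% Both {x} and {y} are transcendence sets; x is algebraic over K(y) = L
% but is not algebraic over K.
\[
  f(Y, Z) = Y^2 + 1 - Z, \qquad
  f(y, Z) = t^2 + 1 - Z \neq 0, \qquad
  f(y, x) = (t^2 + 1) - (t^2 + 1) = 0 .
\]
% Expanding in powers of Y gives c_2 = 1, c_1 = 0, c_0 = 1 - Z.
\[
  f(Y, x) = Y^2 + (1 - x) \neq 0 \ \text{in } K(x)[Y], \qquad f(y, x) = 0,
\]
% so y = t is algebraic over K(x) = \mathbb{Q}(t^2 + 1), as the lemma predicts.
```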


The statement of Lemma 7.8 defines an operation $E \mapsto \overline{K(E)}$ on subsets of $L$.
Because an algebraic extension of an algebraic extension is algebraic, applying
this operation a second time does nothing new: $\overline{K(\overline{K(E)})} = \overline{K(E)}$. We shall
make use of this fact in the proof of Theorem 7.9 below.


Theorem 7.9. If L/K is a field extension, then any two transcendence bases 
of L over K have the same cardinality.


REMARKS. The cardinality is called the transcendence degree of L/K. For
applications to algebraic geometry, the situation of interest is that this cardinality 
is finite, but we give a complete proof of the theorem anyway. 


PROOF. First suppose that $L/K$ has a finite transcendence basis $B$. Let $|B| = n$.
Let $B'$ be another transcendence basis, and let $m = |B \cap B'|$. We prove that
$|B'| = |B|$ by induction downward on $m$. The base case of the induction is that
$m = n$. Then $B \subseteq B'$, and we must have $B = B'$ by maximality of $B$.

For the inductive step, suppose that $m < n$ and that $|B'| = |B|$ whenever
$|B \cap B'| \ge m + 1$. We write the elements of $B$ in an order such that $B =
\{x_1,\dots,x_n\}$ and $B \cap B' = \{x_1,\dots,x_m\}$. Lemma 7.5 shows that $x_{m+1}$ is
transcendental over $K(B - \{x_{m+1}\})$. Hence $x_{m+1}$ does not lie in $\overline{K(B - \{x_{m+1}\})}$.
A second application of Lemma 7.5 shows that $L = \overline{K(B')}$. The inclusion
$B' \subseteq \overline{K(B - \{x_{m+1}\})}$ is impossible because otherwise we would have

$$L = \overline{K(B')} \subseteq \overline{K\bigl(\overline{K(B - \{x_{m+1}\})}\bigr)} = \overline{K(B - \{x_{m+1}\})}.$$

Hence there exists an element $y$ of $B'$ that does not lie in $\overline{K(B - \{x_{m+1}\})}$. A
third application of Lemma 7.5 shows that $(B - \{x_{m+1}\}) \cup \{y\}$ is a transcendence
set for $L/K$. Since $y$ lies in $L = \overline{K(B)}$, the Exchange Lemma (Lemma 7.8)
shows that $x_{m+1}$ lies in $\overline{K((B - \{x_{m+1}\}) \cup \{y\})}$. Consequently $B$ is contained in
$\overline{K((B - \{x_{m+1}\}) \cup \{y\})}$, and $L = \overline{K((B - \{x_{m+1}\}) \cup \{y\})}$. A fourth application
of Lemma 7.5 shows that the transcendence set $B_1 = (B - \{x_{m+1}\}) \cup \{y\}$ is a
transcendence basis. The set $B_1$ has $n$ elements, and the inclusion $B_1 \cap B' \supseteq
\{x_1,\dots,x_m,y\}$ shows that $|B_1 \cap B'| \ge m + 1$. The inductive hypothesis shows
that $|B'| = |B_1|$, and therefore $|B'| = |B|$. This completes the proof under the
assumption that $L/K$ has a finite transcendence basis.

We may now suppose that $L/K$ has no finite transcendence basis. Let $B$ be a
transcendence basis of $L/K$; existence is by Lemma 7.6. To each element $x$ of
$L$, we shall associate a canonical finite subset $E_x$ of $L$.

Since the element $x$ is algebraic over $K(B)$, use of the field polynomial of $x$
over $K(B)$ shows that $x$ is algebraic over $K(E)$ for some finite subset $E$ of $B$.
Let $E_0$ be such a finite set $E$ with the smallest cardinality; the set $E_0$ will be
the canonical finite subset $E_x$ that we seek. To show that $E_0$ is canonical, we
show that whenever $x$ lies in $\overline{K(E)}$ for some finite subset $E$ of $B$, then $E_0 \subseteq E$.
Arguing by contradiction, suppose that $y$ is a member of $E_0$ that is not in $E$, and
define $E_1 = E_0 - \{y\}$. By minimality of $|E_0|$, $x$ does not lie in $\overline{K(E_1)}$. However,
$x$ does lie in $\overline{K(E_1 \cup \{y\})}$. Application of the Exchange Lemma shows that $y$
lies in $\overline{K(E_1 \cup \{x\})}$. Since

$$\overline{K(E_1 \cup \{x\})} \subseteq \overline{K\bigl(E_1 \cup \overline{K(E)}\bigr)} \subseteq \overline{K\bigl(\overline{K(E_1 \cup E)}\bigr)} = \overline{K(E_1 \cup E)},$$

$y$ lies in $\overline{K(E_1 \cup E)}$. Since $y$ is in $B$ but is not in $E_1 \cup E$, Lemma 7.5 shows
that $y$ is not algebraic over $K(E_1 \cup E)$, and we arrive at a contradiction. This
completes the proof that whenever $x$ lies in $\overline{K(E)}$ for some finite subset $E$ of $B$,
then $E_0 \subseteq E$. Hence $E_0$ is canonical.

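For each element $x$ of $L$, we let $E_x$ be the finite subset of $B$ constructed in the
previous paragraph. Then we have a well-defined map of $L$ to the set of all finite
subsets of $B$ given by $x \mapsto E_x \subseteq B$. Now let $B'$ be a second transcendence basis
of $L/K$, and restrict the map from $L$ to $B'$. Taking $S = B'$ and $E = \bigcup_{x \in B'} E_x$
in the indented result quoted just before Lemma 7.8, we find that

$$\operatorname{card}\Bigl(\bigcup_{x \in B'} E_x\Bigr) \le \operatorname{card}(B'). \tag{$*$}$$

On the other hand, any $x$ in $B'$ lies in $\overline{K(E_x)}$ by definition of $E_x$. Hence $B' \subseteq
\overline{K\bigl(\bigcup_{x \in B'} E_x\bigr)}$. Applying the operation $\overline{K(\,\cdot\,)}$ to both sides gives

$$L = \overline{K(B')} \subseteq \overline{K\Bigl(\overline{K\bigl(\textstyle\bigcup_{x \in B'} E_x\bigr)}\Bigr)} = \overline{K\bigl(\textstyle\bigcup_{x \in B'} E_x\bigr)}.$$

Since $\bigcup_{x \in B'} E_x$ is a subset of $B$ and since a proper subset of $B$ cannot be a
transcendence basis of $L/K$, we conclude that

$$B = \bigcup_{x \in B'} E_x.$$

Consequently

$$\operatorname{card} B = \operatorname{card}\Bigl(\bigcup_{x \in B'} E_x\Bigr).$$

In combination with ($*$), this equality implies that $\operatorname{card} B \le \operatorname{card} B'$. Reversing
the roles of $B$ and $B'$ gives $\operatorname{card} B' \le \operatorname{card} B$. Therefore $\operatorname{card} B = \operatorname{card} B'$ by the
Schroeder-Bernstein Theorem.³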


3. Separable and Purely Inseparable Extensions 


Thus far in this book, we have been interested in the detailed structure of algebraic 
field extensions only when they are separable. For applications to algebraic 
geometry, however, algebraic extensions that are not separable arise and even 
play a special role. Thus it is essential to have some understanding of their 
nature. 

Let us review the material on separability in Section IX.6 of Basic Algebra. 
Let K be a field. A polynomial in K[X] is defined to be separable if it splits 
into distinct first-degree factors in its splitting field over K. Let L/K be an 
algebraic extension of fields. An element of L is defined to be separable over 
K if its minimal polynomial over K is separable. Elements of L that fail to be 
separable over $K$ are called inseparable over $K$. The prototype of an inseparable
element is the element $a^{1/p}$ in the extension $k(a^{1/p})$, where $k = \mathbb{F}_p(a)$ is a simple
transcendental extension of the finite field $\mathbb{F}_p$. Corollary 9.31 of Basic Algebra
shows that the separable elements of $L$ over $K$ form a subfield, and $L/K$ is
defined to be separable if every member of $L$ is separable over $K$. As
a consequence of Corollary 9.29 of Basic Algebra, we know that a separable
extension of a separable extension is separable.


³A proof of the Schroeder-Bernstein Theorem appears in the appendix of Basic Algebra.

One further tool from Basic Algebra is needed in order to handle the failure of 
separability. This is Proposition 9.27, which says that an irreducible polynomial 
f (X) in K[X] is separable if and only if f’(X) is not the zero polynomial. It is 
immediate that every irreducible polynomial is separable if K has characteristic 0. 
Thus we need discuss only characteristic p in the remainder of this section. 

The consequence of Proposition 9.27 for characteristic $p$ is that an irreducible
polynomial $f(X)$ fails to be separable over $K$ if and only if the only powers of
$X$ that appear with nonzero coefficient in $f(X)$ are the powers $X^{kp}$, i.e., if and
only if $f(X) = g(X^p)$ for some $g$ in $K[X]$.

In this case the polynomial $g(X)$ is certainly irreducible in $K[X]$, and we can
repeat this process. The polynomial $g(X)$ fails to be separable over $K$ if and
only if $g(X) = h(X^p)$ for some $h$ in $K[X]$. Then $f(X) = h(X^{p^2})$. Repeating
this process as many times as possible, we see that to each irreducible polynomial
$f(X)$ in $K[X]$ correspond a unique nonnegative integer $e$ and a unique separable
irreducible polynomial $g(X)$ such that $f(X) = g(X^{p^e})$. We call $p^e$ the degree of
inseparability of $f(X)$ over $K$. From the definitions an element of an algebraic
extension of $K$ is inseparable if and only if the degree of inseparability of its
minimal polynomial over $K$ is greater than 1.
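To make the decomposition $f(X) = g(X^{p^e})$ concrete, here is a sketch for one specific polynomial. The field $K = \mathbb{F}_p(t)$ and the polynomial $f$ are hypothetical choices of ours; irreducibility is obtained from Eisenstein's criterion at the prime $t$ of $\mathbb{F}_p[t]$ together with Gauss's Lemma.

```latex
% Assumed data: K = \mathbb{F}_p(t), t transcendental over \mathbb{F}_p.
\[
  f(X) = X^{2p} + t\,X^{p} + t, \qquad
  g(X) = X^{2} + t\,X + t, \qquad
  f(X) = g(X^{p}).
\]
% Both f and g are Eisenstein at t (t divides the lower coefficients and
% t^2 does not divide the constant term), hence irreducible over K.
\[
  f'(X) = 2p\,X^{2p-1} + p\,t\,X^{p-1} = 0, \qquad
  g'(X) = 2X + t \neq 0 .
\]
% So f is inseparable, g is separable, the process stops after one step,
% e = 1, and the degree of inseparability of f over K is p^1 = p.
```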

If $L/K$ is an algebraic field extension, then an element $\alpha$ of $L$ is said to be
purely inseparable⁴ over $K$ if $\alpha^{p^\mu}$ lies in $K$ for some integer $\mu \ge 0$. Let us see
in this case that the minimal polynomial of $\alpha$ over $K$ is of the form $X^{p^e} - \alpha^{p^e}$
for some $e \ge 0$.


Proposition 7.10. If $K$ is a field of characteristic $p$ and if $a$ is a member of $K$
such that $\sqrt[p]{a}$ is not in $K$, then $X^{p^\mu} - a$ is irreducible in $K[X]$ for every $\mu \ge 0$.


PROOF. Let $L$ be a splitting field of $X^{p^\mu} - a$ over $K$. If $\beta$ is a root of $X^{p^\mu} - a$,
then $\beta^{p^\mu} = a$, and hence $X^{p^\mu} - a = X^{p^\mu} - \beta^{p^\mu} = (X - \beta)^{p^\mu}$.

Let $f(X)$ be a monic irreducible factor of $X^{p^\mu} - a$ in $K[X]$. Let us see that
$X^{p^\mu} - a = f(X)^n$ for some $n$. In fact, if the contrary were true, then there
would be a second monic irreducible factor $g(X)$ of $X^{p^\mu} - a$ in $K[X]$ relatively
prime to $f(X)$. Then we can write $u(X)f(X) + v(X)g(X) = 1$ for suitable
polynomials $u(X)$ and $v(X)$ in $K[X]$. As members of $L[X]$, both $f(X)$ and
$g(X)$ have to be powers of $X - \beta$ by unique factorization, and thus they both
vanish at $\beta$. Substitution of $\beta$ into $uf + vg = 1$ therefore yields a contradiction.
Hence $X^{p^\mu} - a = f(X)^n$.

Since $f(X)$ has to be $(X - \beta)^m$ for some $m$, we obtain $X^{p^\mu} - a = f(X)^n =
(X - \beta)^{mn}$. The integers $m$ and $n$ must divide $p^\mu$. Thus $m = p^\nu$, and $f(X) =
(X - \beta)^{p^\nu} = X^{p^\nu} - \beta^{p^\nu}$. Since $f(X)$ is assumed to be in $K[X]$, $\beta^{p^\nu}$ lies in $K$. An
inequality $\nu < \mu$ would imply that $\gamma = (\beta^{p^\nu})^{p^{\mu-\nu-1}}$ lies in $K$; the $p$th power of
$\gamma$ is $a$, however, and the hypothesis of the proposition says that such an element
$\gamma$ cannot be in $K$. We conclude that $\nu = \mu$, and thus $f(X) = X^{p^\mu} - a$. In other
words, $X^{p^\mu} - a$ is irreducible in $K[X]$.


⁴Warning: Not every element of $L$ that is purely inseparable over $K$ is inseparable over $K$. The
elements of $K$ are counterexamples. Corollary 7.12 below shows that the elements of $K$ are the only
counterexamples.
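The role of the hypothesis on $p$th roots can be seen by comparing two elements of the same field. In the sketch below, the field $K = \mathbb{F}_p(t)$ and the two choices of $a$ are our own illustrative assumptions.

```latex
% Assumed data: K = \mathbb{F}_p(t), t transcendental over \mathbb{F}_p.
% Case a = t: t has no p-th root in K. Indeed (u/v)^p = t with u, v in
% \mathbb{F}_p[t] would give u^p = t v^p, and comparing degrees yields
% p deg(u) = 1 + p deg(v), which is impossible. Hence, for every \mu,
\[
  X^{p^{\mu}} - t \ \text{is irreducible in } K[X], \qquad
  X^{p^{\mu}} - t = (X - \beta)^{p^{\mu}} \ \text{in } L[X], \quad \beta^{p^{\mu}} = t .
\]
% Case a = t^p: now a has the p-th root t in K, and the polynomial factors:
\[
  X^{p} - t^{p} = (X - t)^{p} \ \text{in } K[X] .
\]
```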


Corollary 7.11. If $L/K$ is an algebraic extension in characteristic $p$, if $\alpha$ is
a purely inseparable element of $L$ over $K$, and if $e$ is the smallest nonnegative
integer such that $\alpha^{p^e}$ lies in $K$, then the minimal polynomial of $\alpha$ over $K$ is
$X^{p^e} - \alpha^{p^e}$.


PROOF. This is immediate from Proposition 7.10. 


Corollary 7.12. If $L/K$ is an algebraic extension in characteristic $p$ and if $\alpha$
is an element of $L$ that is separable and purely inseparable over $K$, then $\alpha$ lies
in $K$.


PROOF. Since $\alpha$ is purely inseparable over $K$, Corollary 7.11 says that the
minimal polynomial of $\alpha$ over $K$ is $X^{p^e} - \alpha^{p^e}$, where $e$ is the smallest nonnegative
integer such that $\alpha^{p^e}$ lies in $K$. The separability of $\alpha$ says that this polynomial
is separable. Unless $p^e = 1$, the polynomial has derivative 0 and thus repeated
roots. Therefore $p^e = 1$ and $e = 0$, and we conclude that $\alpha$ lies in $K$.


An algebraic field extension $L/K$ in characteristic $p$ is said to be purely
inseparable if every element of $L$ is purely inseparable over $K$. Since purely
inseparable elements $\alpha$ have minimal polynomials of the form $X^{p^e} - \alpha^{p^e}$, the
degree of a purely inseparable extension has to be a power of $p$.


Theorem 7.13. If $L/K$ is an algebraic field extension in characteristic $p$ and
if $K_s$ is the subfield of all elements of $L$ that are separable over $K$, then $L/K_s$ is
a purely inseparable extension.


PROOF. Let $\alpha$ be an element of $L$, and let $f(X)$ be the minimal polynomial
of $\alpha$ over $K$. Then we can write $f(X) = g(X^{p^e})$, where $p^e$ is the degree
of inseparability of $f$. The polynomial $g(X)$ is irreducible over $K$, and it is
separable. Since $\alpha^{p^e}$ is a root, $\alpha^{p^e}$ is a separable element. Therefore $\alpha^{p^e}$ lies in
$K_s$. By definition of pure inseparability, $\alpha$ is purely inseparable over $K_s$. Since
$\alpha$ is arbitrary in $L$, $L$ is purely inseparable over $K_s$.
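The two-step structure in the theorem can be exhibited in a small tower. The sketch below assumes $p$ odd and uses a field $L$ of our own choosing; irreducibility of $X^{2p} - t$ is again taken from Eisenstein's criterion at $t$.

```latex
% Assumed data: p odd, K = \mathbb{F}_p(t), L = K(\alpha) with \alpha^{2p} = t.
\[
  f(X) = X^{2p} - t = g(X^{p}), \qquad g(X) = X^{2} - t, \qquad
  g'(X) = 2X \neq 0 .
\]
% g is separable and irreducible, so \alpha^p (a square root of t) lies in K_s.
\[
  K \;\subseteq\; K(\alpha^{p}) \;\subseteq\; L, \qquad
  [K(\alpha^{p}) : K] = 2, \qquad [L : K(\alpha^{p})] = p .
\]
% \alpha is purely inseparable over K(\alpha^p) because \alpha^p lies in it.
% Since [L : K(\alpha^p)] = p is prime and \alpha is inseparable over K
% (f' = 0), the field K_s cannot be L, and therefore K_s = K(\alpha^p).
```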
Corollary 7.14. Let $R$ be a Dedekind domain, let $F$ be its field of fractions,
let $K$ be a finite algebraic extension of $F$, and let $T$ be the integral closure of $R$
in $K$. Then $T$ is a Dedekind domain.


REMARKS. This result is quite important. It was used extensively in Chapter VI, 
as was explained in the remarks with Proposition 6.7, and it plays a foundational 
role in the theory of algebraic curves as presented in Chapters IX and X. Theorem 
8.54 of Basic Algebra proved this result under the assumption that K is a finite 
separable extension of $F$, and we are now dropping the hypothesis of separability.
Since K/F is automatically a separable extension in characteristic 0, we may 
assume that the characteristic is not 0. 


PROOF. Theorem 7.13 shows that $K$ can be obtained in two steps from $F$,
a separable extension followed by a purely inseparable extension. The integral
closure of $R$ in the separable extension field is a Dedekind domain $D$ by Theorem
8.54 of Basic Algebra, and the integral closure of $D$ in $K$ equals $T$ by the
transitivity of integral closure. Consequently it is enough to prove the corollary
under the additional hypothesis that $K$ is a purely inseparable extension of $F$.
What needs proof (in view of the statement of Theorem 8.54 of Basic Algebra)
is that $T$ is Noetherian, i.e., that each ideal of $T$ is finitely generated.

Let $p$ be the characteristic. Since $K/F$ is finite and purely inseparable, there
exists some power $q = p^m$ of $p$ such that the field $K^q$ is contained in $F$;
specifically, the integer $q$ is to be large enough for the $q$th power of each element
of a vector-space basis of $K$ over $F$ to lie in $F$. We begin by proving that


$$T = \{\, b \in K \mid b^q \in R \,\}. \qquad (*)$$


The inclusion $\subseteq$ follows, since $b \in T$ implies that $b^q$ is in $T \cap F = R$. For the
inclusion $\supseteq$, let $b \neq 0$ be in $K$. Corollary 7.11 shows that the minimal polynomial
of $b$ over $F$ is $X^{p^e} - b^{p^e}$, where $e$ is the smallest integer $\geq 0$ such that $b^{p^e}$
lies in $F$. Since $K^{p^m} \subseteq F$, $e \leq m$. Thus $b$ is a root of a polynomial $X^{p^m} - a$,
where $a = b^{p^m}$ is a member of $R$. Consequently $b$ is integral over $R$ and must lie in $T$.
This proves (*).

Fix an algebraic closure $K_{\mathrm{alg}}$ of $K$, and let $H = F^{q^{-1}}$ denote the inverse
image of $F$ under the $q$th power isomorphism of $K_{\mathrm{alg}}$ onto itself. This is a
subfield of $K_{\mathrm{alg}}$, and it contains $K$ because $K^q \subseteq F$. Let $S \subseteq H$ be the ring
of all $b$ in $H$ with $b^q$ in $R$. Since $x \mapsto x^q$ is a field isomorphism of $H$ onto $F$,
$x \mapsto x^q$ is a ring isomorphism of $S$ onto $R$. Therefore $S$ is a Dedekind domain.
It contains $T$ by (*).

Let $I$ be a nonzero ideal in $T$, and form the ideal $J = SI$ in $S$ generated by
$I$. Since $S$ is Dedekind, $J$ is invertible as a fractional ideal of $H$ relative to $S$. If
$J^{-1}$ denotes the inverse, then $J^{-1}$ is a finitely generated $S$ module in $H$ such that
$J^{-1}J = S$. Thus $S = J^{-1}J = J^{-1}SI = J^{-1}I$. Accordingly, choose finite sets
$\{x_i\}$ in $J^{-1}$ and $\{a_i\}$ in $I$ such that $\sum_i x_i a_i = 1$.

We shall show that $\{a_i\}$ is a set of generators of $I$ as an ideal in $T$. We
apply the $q$th power mapping to $\sum_i x_i a_i = 1$, obtaining $\sum_i x_i^q a_i^q = 1$ with
$x_i^q$ in $H^q = F \subseteq K$ and with $a_i^q$ in $S^q = R$. Put $b_i = a_i^{q-1} x_i^q$. Then
$\sum_i x_i^q a_i^q = 1$ implies that $\sum_i a_i b_i = 1$; here $a_i$ is in $I$ and $b_i$ is in
$I^{q-1}K \subseteq K$. If $a$ is in $I$, then $\sum_i (b_i a) a_i = a$, and it is enough to show
that $b_i a$ is in $T$ for each $i$, i.e., to show that $b_i I \subseteq T$ for each $i$.

The $q$-fold product $(x_i I) \cdots (x_i I)$ is contained in $S$ because $x_i I \subseteq J^{-1}J = S$.
Thus $b_i I = x_i^q a_i^{q-1} I \subseteq S$. So $b_i I \subseteq S \cap K$. If $s$ is any element in $S \cap K$,
then we know that $r = s^q$ is a member of $R$ because $S^q = R$. Hence $s$ is a root of
$X^q - r$ with $r$ in $R$. That is, $s$ is integral over $R$. Since $s$ also is in $K$, $s$ lies in the
integral closure of $R$ in $K$, which is $T$. Thus $b_i I \subseteq T$, and the proof is complete.
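
For orientation, the following is a small worked instance of the identity (*) and of the field $H$. The particular fields are an illustrative assumption and are not an example taken from the text: $F = \mathbb{F}_p(u)$, $R = \mathbb{F}_p[u]$, and $K = \mathbb{F}_p(t)$ with $t^p = u$, so that $q = p$ suffices.

```latex
% Illustrative assumption: F = F_p(u), R = F_p[u], K = F_p(t) with t^p = u, q = p.
\begin{align*}
K^p &= \mathbb{F}_p(t^p) = \mathbb{F}_p(u) = F, \\
b(t)^p &= b(t^p) = b(u) \quad \text{for } b \in \mathbb{F}_p(t)
  \qquad (\text{since } c^p = c \text{ for } c \in \mathbb{F}_p), \\
T &= \{\, b \in \mathbb{F}_p(t) \mid b(u) \in \mathbb{F}_p[u] \,\} = \mathbb{F}_p[t].
\end{align*}
```

In this instance $H = F^{q^{-1}}$ coincides with $K$ and $S$ coincides with $T$, so the passage from $T$ to $S$ in the proof is invisible; it matters only when $K$ is a proper subfield of $H$.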


A field $K$ is perfect if either it has characteristic 0 or else it has characteristic
$p$ and the field map $x \mapsto x^p$ of $K$ into itself is onto. Examples of perfect fields
include all finite fields, all algebraically closed fields, and of course all fields of
characteristic 0.


Proposition 7.15. A field K is perfect if and only if every algebraic extension 
of K is separable. 


PROOF. We need to consider only the case that $K$ has characteristic $p$. Suppose
that $x \mapsto x^p$ fails to be onto $K$. Choose $\beta$ in $K$ such that $X^p - \beta$ has no root
in $K$. Proposition 7.10 shows that $X^p - \beta$ is irreducible over $K$. Since this
polynomial has derivative 0, it is not separable. Thus $X^p - \beta$ is a polynomial that
is irreducible but not separable, and adjunction of a root of $X^p - \beta$ to $K$ produces
an extension $L$ of $K$ that is not separable.

Conversely suppose that the field map $x \mapsto x^p$ of $K$ to itself is onto. Then
$x \mapsto x^{p^e}$ is onto $K$ for every $e \geq 0$. Let $L$ be an algebraic extension of $K$,
and let $K_s$ be the subfield of elements separable over $K$. If $\alpha$ is given in $L$,
then Theorem 7.13 shows that there exists a nonnegative integer $e$ such that $\alpha^{p^e}$
is in $K_s$. Let $g(X)$ be the minimal polynomial of $\alpha^{p^e}$ over $K$, and write
$g(X) = X^m + c_1 X^{m-1} + \cdots + c_m$. Since $K$ is perfect, there exists $b_j$ for each $j$
with $1 \leq j \leq m$ such that $b_j^{p^e} = c_j$. Put $f(X) = X^m + b_1 X^{m-1} + \cdots + b_m$. Then


$$f(\alpha)^{p^e} = (\alpha^{p^e})^m + b_1^{p^e} (\alpha^{p^e})^{m-1} + \cdots + b_m^{p^e} = g(\alpha^{p^e}) = 0,$$


and therefore $f(\alpha) = 0$. Consequently the minimal polynomial of $\alpha$ over $K$
divides $f(X)$, and the fact that $\alpha^{p^e}$ lies in $K(\alpha)$ implies that


$$[K(\alpha) : K] \leq \deg f(X) = \deg g(X) = [K(\alpha^{p^e}) : K] \leq [K(\alpha) : K].$$


Equality must hold throughout, and therefore $K(\alpha) = K(\alpha^{p^e})$. Since $K(\alpha^{p^e})$ is
contained in $K_s$, $\alpha$ lies in $K_s$. Therefore every member of $L$ lies in $K_s$, and $L$ is
separable over $K$.
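
A standard field that is not perfect may help fix the ideas in the first half of the proof. The choice $K = \mathbb{F}_p(t)$ below is an illustrative assumption, not an example from the text, and $\tau$ is a hypothetical name for a root of $X^p - t$ in an algebraic closure.

```latex
% Illustrative assumption: K = F_p(t), the field of rational functions in t,
% and tau denotes a root of X^p - t in an algebraic closure of K.
\begin{align*}
& t \notin K^p = \mathbb{F}_p(t^p), \quad \text{so } x \mapsto x^p \text{ is not onto } K, \\
& X^p - t \ \text{is irreducible over } K \ \text{(Proposition 7.10)}, \qquad
  \tfrac{d}{dX}\,(X^p - t) = p X^{p-1} = 0, \\
& X^p - t = (X - \tau)^p \quad \text{in } K(\tau)[X], \ \text{where } \tau^p = t.
\end{align*}
```

Thus $K(\tau)/K$ is an algebraic extension of degree $p$ that is not separable, in accordance with the proposition.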


A function field in $r$ variables over a field $K$ is a field $L$ that is finitely
generated over $K$ and has transcendence degree $r$ over $K$. A transcendence basis
$\{x_1, \dots, x_r\}$ of such an extension $L/K$ is called a separating transcendence
basis of $L/K$ if $L$ is a separable algebraic extension of $K(x_1, \dots, x_r)$. If the
function field $L$ in $r$ variables over $K$ has a separating transcendence basis, we
say that $L$ is separably generated over $K$.

The two kinds of fields of continual interest in Chapter VI were number fields 
and function fields in one variable over a base field. In the latter case some results 
beginning in Section VI.6 assumed in effect that the function field is separably 
generated over the base field. It was asserted at the beginning of Section VI.9 that 
function fields in one variable over finite fields are always separably generated; 
this assertion is a special case of Theorem 7.20 below. 

Proposition 4.28 of Basic Algebra gave a version of the Factor Theorem valid 
for all commutative rings with identity. For the present investigation we need a 
version of the division algorithm that is valid in this wider context. 


Lemma 7.16. Let $R$ be a commutative ring with identity, let $f(X)$ and $g(X)$
be members of $R[X]$ of respective degrees $m$ and $n$, and let $a$ be the leading
coefficient of $g(X)$. For the integer $k = \max(m - n + 1, 0)$, there exist $q(X)$ and
$r(X)$ in $R[X]$ such that


$$a^k f(X) = g(X) q(X) + r(X) \qquad \text{with } \deg r < n \text{ or } r = 0.$$


PROOF. If $m < n$, then $k = 0$, and the displayed formula holds with $q(X) = 0$
and $r(X) = f(X)$. For $m \geq n - 1$, we proceed by induction on $m$. The base
case of the induction is $m = n - 1$, which we have already handled. For the
inductive step, suppose that $m \geq n$. The integer $k$ is $m - n + 1$. If $b$ is the leading
coefficient of $f(X)$, then $a f(X) - b X^{m-n} g(X)$ is a polynomial that either is 0
or has degree less than $m$. The inductive hypothesis allows us to write


$$a^{(m-1)-n+1} \bigl( a f(X) - b X^{m-n} g(X) \bigr) = g(X) q_1(X) + r_1(X)$$


with $\deg r_1 < n$ or $r_1 = 0$. If we set $q(X) = b a^{m-n} X^{m-n} + q_1(X)$ and
$r(X) = r_1(X)$, then we obtain $a^k f(X) = g(X) q(X) + r(X)$, and the lemma follows.
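
The inductive step can be traced on a small case. The ring $R = \mathbb{Z}$ and the two polynomials below are illustrative assumptions chosen so that the leading coefficient of $g$ is not a unit; they do not come from the text.

```latex
% Illustrative assumption: R = Z, f(X) = X^2 + 1, g(X) = 2X + 1.
% Here m = 2, n = 1, a = 2, b = 1, and k = m - n + 1 = 2.
\begin{align*}
a f(X) - b X^{m-n} g(X) &= 2(X^2 + 1) - X(2X + 1) = -X + 2, \\
a^{(m-1)-n+1}\,(-X + 2) &= -2X + 4 = (2X + 1)(-1) + 5
  \qquad (q_1 = -1,\ r_1 = 5), \\
q(X) &= b\,a^{m-n} X^{m-n} + q_1(X) = 2X - 1, \qquad r(X) = 5, \\
a^k f(X) &= 4X^2 + 4 = (2X + 1)(2X - 1) + 5.
\end{align*}
```

No division by $2$ is ever performed; the factor $a^k = 4$ is exactly what clears the denominators that ordinary division over $\mathbb{Q}$ would introduce.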


Lemma 7.17. Let $L/K$ be a field extension, let $x_1, \dots, x_n, x_{n+1}$ be elements
of $L$, and suppose that $x_1, \dots, x_n$ are algebraically independent over $K$ but
that $x_1, \dots, x_n, x_{n+1}$ are not algebraically independent. Then the ideal $I$ of all
polynomials in $K[X_1, \dots, X_{n+1}]$ that vanish at $(x_1, \dots, x_{n+1})$ is principal with a
generator that is irreducible in $K[X_1, \dots, X_{n+1}]$ and involves $X_{n+1}$ nontrivially.


PROOF. The algebraic dependence implies that $I$ contains nonzero polynomials.
Let $g(X_1, \dots, X_n, X_{n+1})$ be one whose degree in $X_{n+1}$ is as small as
possible, say $l$. Expand $g$ as


$$g = c_0(X_1, \dots, X_n) X_{n+1}^{l} + c_1(X_1, \dots, X_n) X_{n+1}^{l-1} + \cdots + c_l(X_1, \dots, X_n).$$


The algebraic independence of $x_1, \dots, x_n$ implies that at least one of $c_0, \dots, c_{l-1}$
is nonzero. Since $K[X_1, \dots, X_n]$ is a unique factorization domain, we can factor
out and discard the greatest common divisor of the coefficients $c_0, \dots, c_l$. Thus
we may assume that $g$ is primitive as a polynomial in $X_{n+1}$. If $f$ is any element
in $I$, then Lemma 7.16 applied to the ring $K[X_1, \dots, X_n]$ allows us to write
$a^k f = gq + r$ with $a = c_0$ and with $r = 0$ or $\deg r < l$, the degree being taken
in $X_{n+1}$. Substituting $(x_1, \dots, x_{n+1})$, we see that
$r$ is in $I$. The minimality of $l$ implies that $r = 0$, and thus $a^k f = gq$. Write $c(h)$
for the greatest common divisor of the coefficients of a polynomial $h$. Taking
the greatest common divisor of the coefficients on each side of $a^k f = gq$ and
applying Gauss's Lemma, we obtain $a^k c(f) = c(q)$. Therefore $a^k$ divides $q$,
and we obtain $f = g q_0$ for some $q_0$. Consequently $I$ is principal. If $g = g_1 g_2$,
then the definition of $I$ shows that at least one of $g_1$ and $g_2$ is in $I$, say $g_1$. The
minimality of $l$ implies that the degree of $g_1$ in $X_{n+1}$ is $l$. Therefore $g_2$ is in
$K[X_1, \dots, X_n]$. Since $g$ is primitive, $g_2$ divides 1. Hence $g_2$ lies in $K$.
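
As a sketch of what the lemma produces in a concrete case, consider the following data, which are an illustrative assumption rather than an example from the text: $K = \mathbb{Q}$, $L = \mathbb{Q}(t)$ with $t$ transcendental, $n = 1$, $x_1 = t^2$, and $x_2 = t^3$.

```latex
% Illustrative assumption: K = Q, L = Q(t), n = 1, x_1 = t^2, x_2 = t^3.
\begin{align*}
& x_1 = t^2 \ \text{is transcendental over } \mathbb{Q}, \qquad x_2^2 = t^6 = x_1^3, \\
& g(X_1, X_2) = X_2^2 - X_1^3, \qquad c_0 = 1,\ c_1 = 0,\ c_2 = -X_1^3, \qquad l = 2, \\
& I = \bigl( X_2^2 - X_1^3 \bigr) \subseteq \mathbb{Q}[X_1, X_2].
\end{align*}
```

The value $l = 2$ is forced: a relation $c_0(t^2)\,t^3 + c_1(t^2) = 0$ splits into its odd and even parts in $t$, so $c_0 = c_1 = 0$. The generator is irreducible because $X_1^3$ is not a square in $\mathbb{Q}[X_1]$.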


Theorem 7.18 (Mac Lane). If $L/K$ is a field extension that is finitely generated
and separably generated, then any set of generators contains a subset that is a
separating transcendence basis of $L/K$.


PROOF. Let the characteristic be p. The proof is by induction on the tran- 
scendence degree of the extension. For transcendence degree 0, the required set 
is the empty set, and there is nothing to prove. The main step is transcendence 
degree 1. 

Thus let $L = K(x_1, \dots, x_n)$, and suppose that $\{z\}$ is a transcendence basis of
$L$ over $K$ such that $L$ is separable over $K(z)$. Since $z$ is transcendental, $z$ does not
lie in $K(z^p)$. Thus Proposition 7.10 shows that $X^p - z^p$ is irreducible over $K(z^p)$,
and $z$ is inseparable over $K(z^p)$. The field $L$ is algebraic over $K(z^p)$, and the
subset of separable elements over $K(z^p)$ is a subfield. Since $L = K(x_1, \dots, x_n)$
and since $z$ is a member of $L$ that is not separable over $K(z^p)$, it follows that some
$x_j$, say $x_1$, is inseparable over $K(z^p)$. It will be proved that $\{x_1\}$ is a separating
transcendence basis of $L$ over $K$, i.e., that $x_1$ is transcendental over $K$ and that $L$
is separable algebraic over $K(x_1)$.

We apply Lemma 7.17 with $n = 1$ to the two elements $z, x_1$. The lemma produces
an irreducible polynomial $f(Z, X)$ in $K[Z, X]$ such that $f(z, x_1) = 0$.
Gauss's Lemma shows that this polynomial remains irreducible when considered
in $K(Z)[X]$, and we have a ring isomorphism $K(Z)[X] \cong K(z)[X]$ because $z$ is
transcendental over $K$. Up to a nonzero factor from $K(z)$, $f(z, X)$ is the minimal
polynomial of $x_1$ over $K(z)$. Since $L$ is separable over $K(z)$, the element $x_1$ is
separable over $K(z)$, and its minimal polynomial over $K(z)$ involves some power
of $X$ that is not a power of $X^p$.

Let us prove that $x_1$ is transcendental over $K$. In the contrary case, let $g(X)$
be its minimal polynomial over $K$. Since $g$ vanishes when $X = x_1$ and $Z = z$,
$g(X)$ satisfies an identity $g(X) = q(Z, X) f(Z, X)$ in $K[Z, X]$. It therefore
satisfies the same identity in $K(X)[Z]$. Since $g(X)$ is a unit in $K(X)[Z]$, so is
$f(Z, X)$. Therefore $f(Z, X)$ is independent of $Z$. Since $g(X)$ is the minimal
polynomial for $x_1$ over $K$, $g(X) = c f(Z, X)$ for some $c$ in $K$. Since $f(Z, X)$
involves a power of $X$ that is not a power of $X^p$, the same thing is true of $g(X)$,
and consequently $x_1$ is separable over $K$. Therefore $x_1$ is separable over the larger
field $K(z^p)$, in contradiction to the defining condition on $x_1$. We conclude that
$x_1$ is transcendental over $K$.

Since $L$ has transcendence degree 1 over $K$, it follows that $z$ is algebraic over
$K(x_1)$. Let us see that $z$ is separable over $K(x_1)$. In fact, Gauss's Lemma shows
that $f(Z, X)$ remains irreducible when considered in $K(X)[Z]$, and we have a
ring isomorphism $K(X)[Z] \cong K(x_1)[Z]$ because $x_1$ is transcendental over $K$.
Therefore $f(Z, x_1)$ is the product of a nonzero member of $K(x_1)$ and the minimal
polynomial $m(Z)$ of $z$ over $K(x_1)$. If $z$ were inseparable over $K(x_1)$, then $m(Z)$
would be a polynomial in $Z^p$, and we would have $f(Z, X) = h(Z^p, X)$ with
$h$ in $K[Z, X]$. We know that $f(Z, X)$ involves some power of $X$ that is not a
power of $X^p$, and hence the same thing is true of $h(Z^p, X)$. Since $h(z^p, X)$ is
irreducible in $K(z^p)[X]$, $x_1$ is separable over $K(z^p)$, in contradiction to the defining
property of $x_1$. Therefore $z$ is separable over $K(x_1)$.

The defining property of $z$ is that all $x_j$ are separable over $K(z)$. Since $z$ is
separable over $K(x_1)$, all of $x_2, \dots, x_n$ are separable over $K(x_1)$. Therefore $L$
is separable over $K(x_1)$, and $\{x_1\}$ is a separating transcendence basis of $L/K$. This
completes the proof of the theorem for transcendence degree 1.

The inductive step is somewhat a formal consequence of what has just been
proved. To see this, suppose that the theorem is known for transcendence
degrees 1 and $r - 1$, and let $L = K(x_1, \dots, x_n)$ have transcendence degree $r$.
The assumption is that $L$ has a transcendence basis $\{z_1, \dots, z_r\}$ such that $L$
is separable over $K(z_1, \dots, z_r)$. Put $K_1 = K(z_1)$. Then the set $\{z_2, \dots, z_r\}$
is a transcendence basis of $L$ over $K_1$ consisting of $r - 1$ elements, and $L$ is
separable over $K_1(z_2, \dots, z_r) = K(z_1, \dots, z_r)$ by assumption. By the inductive
hypothesis for the case of transcendence degree $r - 1$, some subset of $r - 1$
elements from among $x_1, \dots, x_n$ forms a separating transcendence basis of $L$
over $K_1$; let us say that this basis is $\{x_1, \dots, x_{r-1}\}$. This implies that $L$ is
separable over $K_1(x_1, \dots, x_{r-1}) = K(z_1, x_1, \dots, x_{r-1})$. In other words, if
$K' = K(x_1, \dots, x_{r-1})$, then $L = K'(x_r, \dots, x_n)$ is separable over $K'(z_1)$. Since
$L/K'$ has transcendence degree 1, $\{z_1\}$ is a separating transcendence basis of $L/K'$.
By the inductive hypothesis for transcendence degree 1, some $x_j$ for $r \leq j \leq n$
forms a separating transcendence basis of $L/K'$. For this $j$, $\{x_1, \dots, x_{r-1}, x_j\}$ is
then a separating transcendence basis of $L/K$.
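
A small instance of the transcendence-degree-1 step may be useful. The following data are an illustrative assumption, not an example from the text: $K = \mathbb{F}_p$, $L = K(z)$ with $z$ transcendental, and the generators $x_1 = z^p$ and $x_2 = z^{p+1}$, which generate $L$ because $z = x_2 / x_1$.

```latex
% Illustrative assumption: K = F_p, L = K(z), generators x_1 = z^p, x_2 = z^{p+1}.
\begin{align*}
& x_1 = z^p \in K(z^p) \quad \Longrightarrow \quad x_1 \text{ is separable over } K(z^p), \\
& z = x_2 / x_1 \quad \Longrightarrow \quad x_2 \text{ is inseparable over } K(z^p)
   \text{ because } z \text{ is}, \\
& f(Z, X) = Z^{p+1} - X, \qquad f(z, x_2) = 0, \qquad
   \tfrac{\partial f}{\partial Z} = (p+1) Z^p = Z^p \neq 0, \\
& \{x_2\} \text{ is a separating transcendence basis of } L/K, \qquad
   \{x_1\} \text{ is not}.
\end{align*}
```

So the proof selects $x_2$, the generator that is inseparable over $K(z^p)$. The generator $x_1$ is also a transcendence basis, but $L = K(z)$ is purely inseparable of degree $p$ over $K(x_1) = K(z^p)$.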


Lemma 7.19. Suppose that $L$ is a field extension of transcendence degree $r$
over a field $K$ and that $L$ is not separably generated over $K$. If $x_1, \dots, x_n$ are
elements of $L$ such that $L = K(x_1, \dots, x_n)$, then for a suitable relabeling of the
$x_j$'s, the subfield $K(x_1, \dots, x_{r+1})$ of $L$ is of transcendence degree $r$ and is not
separably generated over $K$.


PROOF. We fix $K$ and $r$, and we proceed by induction on $n$. The base case is
that $n = r + 1$, and then there is nothing to prove. For the inductive step, suppose
that the lemma has been proved for $n - 1$ when $n > r + 1$. We prove the lemma
for $n$. Since $r < n$, we can renumber the $x_j$'s and assume that $K(x_2, \dots, x_n)$
has transcendence degree $r$ over $K$. If this field is not separably generated over
$K$, then we are in a situation with $n - 1$ elements. The inductive hypothesis is
applicable, and the lemma follows in this case.

Thus suppose that $K(x_2, \dots, x_n)$ is separably generated over $K$. Theorem 7.18
shows that after a renumbering of the indices, we may assume that $\{x_2, \dots, x_{r+1}\}$
is a separating transcendence basis of $K(x_2, \dots, x_n)$ over $K$. This implies that
$K(x_2, \dots, x_n)$ is a separable extension of $K(x_2, \dots, x_{r+1})$. Since by assumption
$L = K(x_1, \dots, x_n)$ is not separably generated over $K$, $K(x_1, \dots, x_n)$ is not
separable over $K(x_2, \dots, x_{r+1})$. A separable extension of a separable extension is
separable, and we deduce that $K(x_1, \dots, x_n)$ is not separable over $K(x_2, \dots, x_n)$.
Thus $x_1$ is inseparable over $K(x_2, \dots, x_n)$ and is consequently inseparable over
the subfield $K(x_2, \dots, x_{r+1})$. Hence $K(x_1, \dots, x_{r+1})$ is not separably generated
over $K$.


Theorem 7.20 (F. K. Schmidt). If $K$ is a perfect field, then every finitely
generated field extension of $K$ is separably generated over $K$.


REMARK. In particular, the theorem applies if K is a finite field or is alge- 
braically closed or has characteristic 0. 


PROOF. Let $K$ have characteristic $p$. We induct on the transcendence degree
of the field extension of $K$. The base case of the induction is transcendence
degree 0, and then the theorem is handled by Proposition 7.15. For the inductive
step, assume that the theorem holds for all finitely generated field extensions of
$K$ having transcendence degree $r - 1$ over $K$. Let $L = K(x_1, \dots, x_n)$ have
transcendence degree $r$ over $K$. Arguing by contradiction, suppose that $L$ is not
separably generated over $K$. Lemma 7.19 shows for a suitable renumbering of the
$x_j$'s that $K' = K(x_1, \dots, x_{r+1})$ has transcendence degree $r$ and is not separably
generated over $K$.

We divide matters into two cases. First suppose that the transcendence degree
of $K'' = K(x_1, \dots, x_r)$ is $r - 1$. The inductive hypothesis shows that $K''$ is
separably generated over $K$, and then Theorem 7.18 shows that we may renumber
the variables in such a way that $\{x_1, \dots, x_{r-1}\}$ is a transcendence basis of $K''$ over
$K$ and $K''$ is separable algebraic over $K(x_1, \dots, x_{r-1})$. Then $\{x_1, \dots, x_{r-1}, x_{r+1}\}$
is a transcendence basis of $K'$, and $x_r$ is algebraic over $K(x_1, \dots, x_{r-1}, x_{r+1})$.
Since $x_r$ is separable over $K(x_1, \dots, x_{r-1})$, it is separable over the larger field
$K(x_1, \dots, x_{r-1}, x_{r+1})$. Therefore $K'$ is separably generated over $K$, contradiction.

The remaining case is that every subset of $r$ members of $\{x_1, \dots, x_{r+1}\}$ is a
transcendence basis of $K'$ over $K$. Lemma 7.17 produces an irreducible polynomial
$f$ in $K[X_1, \dots, X_{r+1}]$ such that $f(x_1, \dots, x_{r+1}) = 0$. Since $\{x_1, \dots, x_r\}$
is a transcendence basis of $K'$, application of Gauss's Lemma shows that $f$ is
irreducible in $K(X_1, \dots, X_r)[X_{r+1}] \cong K(x_1, \dots, x_r)[X_{r+1}]$. Hence up to a
nonzero factor from $K(x_1, \dots, x_r)$, $f(x_1, \dots, x_r, X_{r+1})$ is the minimal polynomial
of $x_{r+1}$ over $K(x_1, \dots, x_r)$. The failure of $K'$ to be separably generated over $K$ implies
that $x_{r+1}$ is inseparable over $K(x_1, \dots, x_r)$, and thus the only powers of $X_{r+1}$ that
appear in its minimal polynomial over $K(x_1, \dots, x_r)$ are powers of $X_{r+1}^p$. In other
words, $f$ is in $K[X_1, \dots, X_r, X_{r+1}^p]$. Since we are assuming that any $r$ of the
elements $x_1, \dots, x_{r+1}$ form a transcendence basis of $K'$ over $K$, there is nothing
special about $X_{r+1}$ in this argument. Consequently $f$ is in $K[X_1^p, \dots, X_r^p, X_{r+1}^p]$.
Since $K$ is perfect, any polynomial involving only $p$th powers of each indeterminate
is the $p$th power of some polynomial. Consequently $f$ is reducible in
$K[X_1, \dots, X_{r+1}]$, in contradiction to the irreducibility guaranteed by Lemma
7.17. All cases thus lead to a contradiction, and the proof is complete.
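
The last step of the proof, that a polynomial in the $p$th powers of the indeterminates over a perfect field is itself a $p$th power, can be checked by hand in a small case. The field $\mathbb{F}_3$ and the polynomial below are illustrative assumptions; in the second line $\alpha$ runs over multi-indices and $X^{\alpha}$ is the corresponding monomial.

```latex
% Illustrative assumption: K = F_3 (a perfect field), two indeterminates X and Y.
\begin{align*}
(X + 2Y + 1)^3 &= X^3 + (2Y)^3 + 1^3 = X^3 + 8Y^3 + 1 = X^3 + 2Y^3 + 1
  \quad \text{in } \mathbb{F}_3[X, Y], \\
\sum_{\alpha} c_\alpha X^{p\alpha} &= \Bigl( \sum_{\alpha} b_\alpha X^{\alpha} \Bigr)^{p}
  \quad \text{whenever } b_\alpha^{\,p} = c_\alpha \text{ in a field of characteristic } p.
\end{align*}
```

Perfectness of $K$ is what guarantees that the $p$th roots $b_\alpha$ of the coefficients exist in $K$. Over $\mathbb{F}_p(t)$ the polynomial $X^p - t$ involves only $p$th powers of $X$ and yet is irreducible.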


4. Krull Dimension 


In this section we develop the algebraic background necessary for a discussion
of dimension. Suppose that $K$ is an algebraically closed field, suppose that $I$ is
a prime ideal in $K[X_1, \dots, X_n]$, and suppose that $V(I)$ is the locus of common
zeros of $I$. Corollary 7.2 shows that $I$ is the set of all polynomials vanishing on
$V(I)$, and thus the integral domain $R = K[X_1, \dots, X_n]/I$ may be regarded as
the set of all restrictions to $V(I)$ of polynomials. If $L$ is the field of fractions
of $R$, then the transcendence degree of $L/K$ is interpreted as the "number of
independent variables" on the locus $V(I)$. We define it to be the dimension of
$V(I)$. The elements $X_j + I$ of $R$ for $1 \leq j \leq n$ generate $R$ as a $K$ algebra,
and therefore they generate $L$ over $K$ as a field. We shall make critical use of
the fact implied by Lemma 7.6b that some subset of $\{X_1 + I, \dots, X_n + I\}$ is a
transcendence basis of $L$. We shall speak of such a subset as a transcendence
basis of $R$ for economy of words. We denote its cardinality by $\operatorname{tr.deg} R$.


EXAMPLE. We continue with the example from Sections 1-2. Let $K = \mathbb{C}$,
let $I$ be the principal ideal $(Y^2 - X(X + 1)(X - 1))$ in $\mathbb{C}[X, Y]$, and let $L$
be the field of fractions of the integral domain $R = \mathbb{C}[X, Y]/I$. Corollary
7.2 shows that the ring $R$ is the ring of restrictions of polynomials to the locus
$V(I) = \{(x, y) \in \mathbb{C}^2 \mid y^2 = x(x + 1)(x - 1)\}$. According to the above definition,
the dimension of $V(I)$ is the transcendence degree of $L$, which we have seen is 1.
This is in accord with the intuition that the locus $V(I)$ is a "curve" in the sense
of having one independent complex parameter.


The goal of this section is to produce an equivalent definition of dimension
that does not depend on the fact that $K[X_1, \dots, X_n]/I$ is an integral domain.
The rephrased definition will extend to any commutative ring with identity and 
is essential for modern algebraic geometry. 

Let $R$ be any commutative ring with identity. The Krull dimension of $R$,
denoted by $\dim R$, is the supremum of the indices $d$ of all strictly increasing
chains


$$P_0 \subsetneq P_1 \subsetneq \cdots \subsetneq P_d$$


of prime ideals in $R$. We define $\dim R = \infty$ if there is no finite supremum.


EXAMPLES OF KRULL DIMENSION. 


(1) R equal to a field. The only prime ideal is 0. Thus the Krull dimension of 
any field is 0. 


(2) $R = \mathbb{Z}$. The prime ideals are of the form $p\mathbb{Z}$ for each prime number $p$,
together with 0. Each nonzero prime ideal is maximal. Consequently there is a
strictly increasing chain $0 \subsetneq p\mathbb{Z}$ of prime ideals for each prime number $p$, but
there are no such chains of greater length. Thus $\dim \mathbb{Z} = 1$. More generally any
principal ideal domain $R$ that is not a field, or even any Dedekind domain $R$ that is
not a field, has $\dim R = 1$ because every nonzero prime ideal is maximal.


(3) $R$ commutative Artinian. In Chapter II a ring with identity was defined to be
Artinian if its two-sided ideals satisfy the descending chain condition. Problem 8
at the end of that chapter showed that every prime ideal in such a ring is maximal. 
In other words, every commutative Artinian ring has Krull dimension 0. 


(4) Polynomial ring $R = K[X_1, \dots, X_n]$, where $K$ is a field. In geometric
terms for the case that $K$ is algebraically closed, the relevant zero locus for this
$R$ is $K^n$, which we certainly want to have dimension equal to $n$, and the field of
fractions of $R$ is $K(X_1, \dots, X_n)$, which indeed has transcendence degree $n$. Let
us examine the Krull dimension of $R$. If $0 \leq k \leq n$ and if we form the ideal
$(X_1, \dots, X_k)$, then the ring isomorphism


$$R \cong K[X_{k+1}, \dots, X_n][X_1, \dots, X_k]$$


shows that the quotient $R/(X_1, \dots, X_k)$ is isomorphic to $K[X_{k+1}, \dots, X_n]$,
which is an integral domain. Therefore $(X_1, \dots, X_k)$ is prime, and we have
a strictly increasing chain


$$0 \subsetneq (X_1) \subsetneq (X_1, X_2) \subsetneq \cdots \subsetneq (X_1, \dots, X_n).$$


So $\dim K[X_1, \dots, X_n] \geq n$. Actually, equality holds, as Theorem 7.22 will
show.


Lemma 7.21. Let $R$ be a commutative ring with identity, let $S^{-1}R$ be the
localization relative to a multiplicative system $S$ in $R$, let $I$ be an ideal in $R$, and
let $\bar{S}$ be the image of $S$ in $R/I$. Then


$$S^{-1}R / S^{-1}I \cong \bar{S}^{-1}(R/I)$$


via the mapping $s^{-1}r + S^{-1}I \mapsto (s + I)^{-1}(r + I)$.
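
Before turning to the proof, it may help to see the two sides of the isomorphism in a case where $S$ meets the zero divisors of $R/I$. The data below are an illustrative assumption, not an example from the text: $R = \mathbb{Z}$, $S$ the set of powers of 2, and $I = 6\mathbb{Z}$.

```latex
% Illustrative assumption: R = Z, S = {1, 2, 4, 8, ...}, I = 6Z.
\begin{align*}
S^{-1}R &= \mathbb{Z}[\tfrac12], \qquad
  S^{-1}I = 6\,\mathbb{Z}[\tfrac12] = 3\,\mathbb{Z}[\tfrac12], \qquad
  S^{-1}R / S^{-1}I \cong \mathbb{Z}/3\mathbb{Z}, \\
R/I &= \mathbb{Z}/6\mathbb{Z}, \qquad \bar{S} = \{\bar 1, \bar 2, \bar 4\}, \qquad
  \bar{S}^{-1}(\mathbb{Z}/6\mathbb{Z}) \cong \mathbb{Z}/3\mathbb{Z}, \\
\tfrac{5}{2} + S^{-1}I &\longmapsto (\bar 2)^{-1}\,\bar 5 .
\end{align*}
```

On the right side the element $\bar 3$ of $\mathbb{Z}/6\mathbb{Z}$ becomes 0 in the localization because $\bar 2 \cdot \bar 3 = 0$ with $\bar 2$ in $\bar{S}$. On the left side the same collapse appears as the equality $6\,\mathbb{Z}[\tfrac12] = 3\,\mathbb{Z}[\tfrac12]$.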


PROOF. Let $q : R \to R/I$ and $\bar{q} : S^{-1}R \to S^{-1}R/S^{-1}I$ be the quotient
homomorphisms, and let $\eta : R \to S^{-1}R$ and $\bar{\eta} : R/I \to \bar{S}^{-1}(R/I)$ be the
canonical homomorphisms of $R$ and $R/I$ into their localizations. To each of the
rings $X_1 = S^{-1}R/S^{-1}I$ and $X_2 = \bar{S}^{-1}(R/I)$ is associated a canonical map,
namely $\eta_1 : R \to X_1$ and $\eta_2 : R \to X_2$ with $\eta_1 = \bar{q}\eta$ and $\eta_2 = \bar{\eta}q$. Let
us see that the pairs $(X_i, \eta_i)$ for $i = 1, 2$ have the following universal mapping
property with respect to ring homomorphisms $\varphi$ of $R$ into a commutative ring
$T$ with identity such that $\varphi(1) = 1$, $\varphi(I) = 0$, and $\varphi(S) \subseteq T^{\times}$: there exists a
unique homomorphism $\varphi_i : X_i \to T$ such that $\varphi = \varphi_i \eta_i$.

For $i = 1$, we first apply the universal mapping property of the localization
$S^{-1}R$ to write $\varphi = \psi\eta$ and then apply the universal mapping property of the
quotient to write $\varphi = \varphi_1 \bar{q} \eta$. For $i = 2$, we first apply the universal mapping
property of the quotient $R/I$ to write $\varphi = \bar{\varphi} q$ and then apply the universal
mapping property of the localization to write $\varphi = \varphi_2 \bar{\eta} q$. From these constructions we
deduce existence and uniqueness of $\varphi_i$ in both cases. The asserted isomorphism
then follows from the general fact that objects satisfying a universal mapping
property are unique up to isomorphism; tracking down that isomorphism gives
the explicit formula in the lemma.


Theorem 7.22. Let $K$ be a field, let $R$ be an integral domain that is finitely
generated as a $K$ algebra, and let $L$ be the field of fractions of $R$. Then the Krull
dimension of $R$ equals the transcendence degree of $L$ over $K$.


PROOF. If $x_1, \dots, x_n$ are generators of $R$ as a $K$ algebra, then
$R \cong K[X_1, \dots, X_n]/I$, where $I$ is the ideal of all polynomials in $K[X_1, \dots, X_n]$
that vanish at $(x_1, \dots, x_n)$. The ideal $I$ is prime, since $R$ is assumed to be an
integral domain. Let $r$ be the transcendence degree of $L$ over $K$. We know from
Lemma 7.6b that some subset of $\{x_1, \dots, x_n\}$ is a transcendence basis of $L$ over
$K$; therefore $r \leq n$. To prove the theorem, we shall prove that $r \geq \dim R$ and
that $r \leq \dim R$.

Suppose that $P$ and $Q$ are prime ideals of $R$ with $P \subseteq Q$. Then the identity
map on $R$ descends to a $K$ algebra homomorphism $\varphi : R/P \to R/Q$. If
$\alpha_j = x_j + P$ and $\beta_j = x_j + Q$ are the images of $x_j$ under the respective quotient
maps $R \to R/P$ and $R \to R/Q$, then $\{\alpha_1, \dots, \alpha_n\}$ is a set of generators of $R/P$,
$\{\beta_1, \dots, \beta_n\}$ is a set of generators of $R/Q$, and $\varphi(\alpha_j) = \beta_j$ for $1 \leq j \leq n$. If
$r' = \operatorname{tr.deg} R/Q$, we may assume that $\{\beta_1, \dots, \beta_{r'}\}$ is a transcendence basis of $R/Q$.
Then $\{\alpha_1, \dots, \alpha_{r'}\}$ is an algebraically independent subset of $R/P$ over $K$ because
if $f$ is a nonzero polynomial in $K[X_1, \dots, X_{r'}]$ such that $f(\alpha_1, \dots, \alpha_{r'}) = 0$,
then application of $\varphi$ and use of the fact that $\varphi$ fixes each coefficient of $f$ yields
$f(\beta_1, \dots, \beta_{r'}) = 0$; the latter equation contradicts the algebraic independence of
$\{\beta_1, \dots, \beta_{r'}\}$. We conclude that


$$P \subseteq Q \quad \text{implies} \quad \operatorname{tr.deg}(R/P) \geq \operatorname{tr.deg}(R/Q). \qquad (*)$$


To prove the inequality $r \geq \dim R$, let a chain of prime ideals


$$0 \subseteq P_0 \subsetneq P_1 \subsetneq \cdots \subsetneq P_d$$


of $R$ be given. We are to show that $r \geq d$. Abbreviate $K[X_1, \dots, X_n]$ as $A$, so
that $R = A/I$. Pull the chain of ideals of $R$ back to a chain of ideals in $A$ as


$$I \subseteq P_0' \subsetneq P_1' \subsetneq \cdots \subsetneq P_d'. \qquad (**)$$


Inequality (*) shows that


$$\operatorname{tr.deg}(A/P_0') \geq \operatorname{tr.deg}(A/P_1') \geq \cdots \geq \operatorname{tr.deg}(A/P_d'). \qquad (\dagger)$$


Since taking $P_0' = I$ shows that $\operatorname{tr.deg}(A/I) = \operatorname{tr.deg}(R) = r$, every member of
($\dagger$) is $\leq r$. It will follow from ($\dagger$) that $r \geq d$ if we show that each inequality in
($\dagger$) is strict, i.e., that for prime ideals $P$ and $Q$ in $A$,


$$P \subsetneq Q \quad \text{implies} \quad \operatorname{tr.deg}(A/P) > \operatorname{tr.deg}(A/Q). \qquad (\dagger\dagger)$$


Since $\dim R$ is the supremum of the integers $d$ as in (**) and ($\dagger$), proving ($\dagger\dagger$)
will prove that $r \geq \dim R$.

Thus let $P$ and $Q$ be prime ideals in $A = K[X_1, \dots, X_n]$ with $P \subsetneq Q$. Put
$\alpha_j = X_j + P$ and $\beta_j = X_j + Q$, so that the mappings of $A$ to $A/P$ and $A/Q$
are $f(X_1, \dots, X_n) \mapsto f(\alpha_1, \dots, \alpha_n)$ and $f(X_1, \dots, X_n) \mapsto f(\beta_1, \dots, \beta_n)$.
Then $A/P = K[\alpha_1, \dots, \alpha_n]$ and $A/Q = K[\beta_1, \dots, \beta_n]$. As above, if
$r' = \operatorname{tr.deg} A/Q$, then we may assume that $\{\beta_1, \dots, \beta_{r'}\}$ is a transcendence basis of
$A/Q$. Arguing by contradiction, we may assume that $\operatorname{tr.deg} A/P = \operatorname{tr.deg} A/Q$.
Then it follows that $\{\alpha_1, \dots, \alpha_{r'}\}$ is a transcendence basis of $A/P$. We localize $A$
with respect to the multiplicative system $S$ consisting of the complement of 0 in
$K[X_1, \dots, X_{r'}]$. Then $S^{-1}A = K(X_1, \dots, X_{r'})[X_{r'+1}, \dots, X_n]$. To understand
$S^{-1}P$, we apply Lemma 7.21 to write


$$S^{-1}A / S^{-1}P \cong \bar{S}^{-1}(A/P), \qquad (\ddagger)$$


where $\bar{S}$ is the image of $S$ in $A/P$. The restriction to $K[X_1, \dots, X_{r'}]$ of the
map $A \to A/P$ carries $f(X_1, \dots, X_{r'})$ to $f(\alpha_1, \dots, \alpha_{r'})$ and is one-one because
$\{\alpha_1, \dots, \alpha_{r'}\}$ is a transcendence set. Therefore $S \cap P = \varnothing$, and $S \to \bar{S}$ is
one-one. Corollary 8.48d of Basic Algebra shows from $S \cap P = \varnothing$ that $S^{-1}P$
is a proper ideal of $S^{-1}A$. Since $S \to \bar{S}$ is one-one, let us view $\bar{S}$ as
$\bar{S} = \{f(\alpha_1, \dots, \alpha_{r'}) \mid f \neq 0\}$. Then


$$\bar{S}^{-1}(A/P) = K(\alpha_1, \dots, \alpha_{r'})[\alpha_{r'+1}, \dots, \alpha_n]. \qquad (\ddagger\ddagger)$$


Since $\alpha_{r'+1}, \dots, \alpha_n$ are algebraic over $K(\alpha_1, \dots, \alpha_{r'})$ because of the assumption
$\operatorname{tr.deg} A/P = \operatorname{tr.deg} A/Q = r'$, the remark with Lemma 7.3 shows that ($\ddagger\ddagger$) is
a field. By ($\ddagger$), $S^{-1}P$ is a maximal ideal. Arguing similarly with $Q$, we see that
$S \cap Q = \varnothing$ and that $S^{-1}Q$ is a maximal ideal. From $P \subseteq Q$, we have
$S^{-1}P \subseteq S^{-1}Q$. Because $S^{-1}P$ and $S^{-1}Q$ are maximal, $S^{-1}P = S^{-1}Q$. Therefore
$Q \subseteq S^{-1}P$. Since $Q$ properly contains $P$, we can choose $g$ in $Q$ that is not in $P$. This $g$
is an element of $K[X_1, \dots, X_n]$ such that $g(\alpha_1, \dots, \alpha_n) \neq 0$ and $g(\beta_1, \dots, \beta_n) = 0$.
From the inclusion $Q \subseteq S^{-1}P$, there exist an $f$ in $P$ and a nonzero $s$ in
$K[X_1, \dots, X_{r'}]$ with $g = s^{-1}f$. Then $f = sg$. Since $f(\alpha_1, \dots, \alpha_n) = 0$ and
$s(\alpha_1, \dots, \alpha_{r'})\, g(\alpha_1, \dots, \alpha_n) \neq 0$, we obtain a contradiction. This contradiction
proves ($\dagger\dagger$) and shows that $r \geq \dim R$.

The argument that $r \leq \dim R$ will proceed by induction on $r$. If $r = 0$, then
$R = K[x_1, \dots, x_n]$ is a field by the remark with Lemma 7.3, and $\dim R = 0$ by
Example 1 of Krull dimension. Now suppose inductively that $r > 0$ and that the
inequality is known when $\operatorname{tr.deg} R < r$. Put $A = K[X_1, \dots, X_n]$, and suppose
that $R = A/I = K[x_1, \dots, x_n]$ with $x_1$ transcendental over $K$. We localize $A$
with respect to the multiplicative system $S$ consisting of the complement of 0
in $K[X_1]$. Then $S^{-1}A = K(X_1)[X_2, \dots, X_n]$. To understand $S^{-1}I$, we apply
Lemma 7.21 to write


$$S^{-1}A / S^{-1}I \cong \bar{S}^{-1}(A/I),$$


where $\bar{S}$ is the image of $S$ in $A/I$. Arguing as in the previous paragraph, we see
that


$$\bar{S}^{-1}(A/I) = K(x_1)[x_2, \dots, x_n].$$


Combining these two isomorphisms, we see that $S^{-1}A/S^{-1}I$ has transcendence
degree $r - 1$ over $K(x_1)$. By the inductive hypothesis, $S^{-1}A/S^{-1}I$ has Krull
dimension $\geq r - 1$. Thus there exists a strictly increasing chain


$$S^{-1}I = Q_0 \subsetneq Q_1 \subsetneq \cdots \subsetneq Q_{r-1}$$


of prime ideals in $S^{-1}A$. If we put $P_j = A \cap Q_j$ for each $j$, then each $P_j$ is prime
in $A$. From the theory of localization, we know that $Q_j$ is recovered from $P_j$ by
$Q_j = S^{-1}P_j$, and thus we have a strictly increasing chain


$$I = P_0 \subsetneq P_1 \subsetneq \cdots \subsetneq P_{r-1} \qquad (\S)$$


of prime ideals in $A$. The fact that $Q_{r-1} = S^{-1}P_{r-1}$ is proper implies that
$S \cap P_{r-1} = \varnothing$. That is, no nonzero member of $K[X_1]$ lies in $P_{r-1}$. Consequently
the image of $X_1$ in $A/P_{r-1}$ is transcendental over $K$. The Nullstellensatz (Theorem 7.1)
shows that $P_{r-1}$ is not maximal in $A$. Hence the chain ($\S$) can be extended by a strict
inclusion in a maximal ideal $P_r$, and $r \leq \dim A/I = \dim R$. This completes the
induction and the proof.
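
The theorem can be checked against the running example of this chapter. The particular chain below is an illustrative choice and is not taken from the text.

```latex
% Illustrative assumption: the running example K = C, I = (Y^2 - X(X+1)(X-1)).
\begin{align*}
R &= \mathbb{C}[X, Y]/I, \qquad x = X + I, \quad y = Y + I, \\
R/(x, y) &\cong \mathbb{C}[X, Y]/(X, Y) \cong \mathbb{C}
  \qquad (\text{because } Y^2 - X(X+1)(X-1) \in (X, Y)), \\
0 &\subsetneq (x, y) \quad \text{is a chain of prime ideals of length } 1
  = \operatorname{tr.deg} R = \dim R .
\end{align*}
```

The ideal $(x, y)$ is maximal because the quotient is a field, and it corresponds to the point $(0, 0)$ of the curve. Theorem 7.22 says that no strictly longer chain of prime ideals exists in $R$.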


5. Nonsingular and Singular Points 


In this section we develop the initial algebraic background necessary for a dis- 
cussion of nonsingular and singular points. Unlike what happened in previous 
sections, we shall not try to separate completely the algebra from the geometric 
setting, because the points to be investigated are the actual points of a zero locus. 

The motivation comes from the Implicit Function Theorem in the calculus of
several variables. In that setting, suppose that we have $l$ numerical-valued smooth
functions $f_1, \dots, f_l$ of $n$ variables. Let $k$ be an integer with $1 \leq k < n$, and
abbreviate $(x_1, \dots, x_n)$ as $(x, y)$, where $x = (x_1, \dots, x_k)$ and $y = (x_{k+1}, \dots, x_n)$.
Suppose that $(x_0, y_0)$ has the property that $f_i(x_0, y_0) = 0$ for $1 \leq i \leq l$. The hope
is that there is a smooth vector-valued function $y = g(x)$ defined near $x = x_0$
such that $y_0 = g(x_0)$ and such that $f_i(x, y) = 0$ for $1 \leq i \leq l$ with $(x, y)$ near
$(x_0, y_0)$ if and only if $y = g(x)$, i.e., that the locus of common zeros of $f_1, \dots, f_l$
is locally the graph of a smooth function of $k$ variables. According to the Implicit
Function Theorem, a sufficient condition for this to happen is that $k + l = n$ and
that the (square) matrix of the first partial derivatives at $(x_0, y_0)$ of the $f_i$'s for
$1 \leq i \leq l$ with respect to the variables $x_j$ for $k + 1 \leq j \leq n$, i.e., the components
of $y$, be invertible. A little more generally but still with $k + l = n$, the locus of common
zeros is locally the graph of a smooth function of $l$ of the variables in terms of the
remaining $k$ variables if the matrix of all the first partial derivatives of the $f_i$'s has
the maximum possible rank, namely $l$.

Let us describe the setting for a comparable situation in algebraic geometry.
Throughout this section we assume that $K$ is an algebraically closed field.
Suppose that $I$ is a prime ideal in $K[X_1, \dots, X_n]$, and let $V(I)$ be the locus of
common zeros⁵ of $I$ in $K^n$. The Hilbert Basis Theorem shows that $I$ is finitely
generated as an ideal, and we let $\{f_1, \dots, f_l\}$ be a set of generators.
Corollary 7.2 shows that $I$ is the set of all polynomials vanishing on $V(I)$, and
thus the integral domain $R = K[X_1, \dots, X_n]/I$ may be regarded as the set of all
restrictions to $V(I)$ of polynomials in the following sense: if $x = (x_1, \dots, x_n)$ is
a member of $V(I)$ and $f(X_1, \dots, X_n)$ is in $K[X_1, \dots, X_n]$, then every member
of the coset $f + I$ has the same value at $x$, and it is consequently meaningful to
write $f(x)$ for $f$ in $R$.

From Theorem 7.22 the transcendence degree over $K$ of the field of fractions
of $R$ equals the Krull dimension of the ring $R$, and these numbers are what is
taken as the dimension of $V(I)$ over $K$. We write $\dim V(I)$ for this dimension.
In this setting, a point $x$ of $V(I)$ is called a nonsingular point, or regular point,
if the matrix $\bigl[\frac{\partial f_i}{\partial x_j}(x)\bigr]$ has rank equal to $n - \dim V(I)$. Otherwise $x$ is a
singular point.

It is important to observe that these definitions do not depend on the choice of
the set $\{f_1, \dots, f_l\}$ of generators of $I$. In fact, it is enough to show that the row
space of the matrix $\big[\frac{\partial f_i}{\partial X_j}(x)\big]$ is exactly the space of all row vectors

$$\Big(\tfrac{\partial f}{\partial X_1}(x)\ \ \cdots\ \ \tfrac{\partial f}{\partial X_n}(x)\Big) \qquad \text{for } f \in I,$$

since the latter space is manifestly independent of the choice of generators. To see
that the displayed space equals the row space of the matrix whose rank appears
in the definition of singular point, let $g_1, \dots, g_l$ be arbitrary polynomials. Then
$f = \sum_i g_i f_i$ is the most general member of $I$. Use of the product rule and the
fact that $f_i(x) = 0$ for each $i$ shows that $\frac{\partial f}{\partial X_j}(x) = \sum_i g_i(x) \frac{\partial f_i}{\partial X_j}(x)$. Since the
$g_i$ are arbitrary, we can arrange for $(g_1(x), \dots, g_l(x))$ to be any given member
of $K^l$. Thus the space of all row vectors $\big(\frac{\partial f}{\partial X_1}(x)\ \cdots\ \frac{\partial f}{\partial X_n}(x)\big)$ for $f \in I$ is
the set of all $K$ linear combinations of row vectors $\big(\frac{\partial f_i}{\partial X_1}(x)\ \cdots\ \frac{\partial f_i}{\partial X_n}(x)\big)$ for
$1 \le i \le l$, as asserted.

⁵In terminology to be used in later chapters, one says that $V(I)$ is the affine variety corresponding
to $I$.

EXAMPLES.

(1) Irreducible affine curve⁶ in $K^2$. Suppose that $n = 2$ in the notation
above and that $I$ is nonzero and is generated by a single nonconstant polynomial
$f(X, Y)$. The condition that $I$ be prime is exactly the condition that $f(X, Y)$
be a prime polynomial. In turn, since $K[X, Y]$ is a unique factorization domain,
the condition that $f(X, Y)$ be prime is exactly the condition that $f(X, Y)$ be
irreducible. Let us specialize to a case for which the first partial derivatives take
an especially simple form: suppose that

$$f(X, Y) = Y^2 - h(X).$$

The only possible factorization is $f(X, Y) = \big(Y + \sqrt{h(X)}\big)\big(Y - \sqrt{h(X)}\big)$, and
thus $f(X, Y)$ is irreducible in $K[X, Y]$ if $h(X)$ is not the square of a member
of $K[X]$. The relevant integral domain is $R = K[X, Y]/(f(X, Y))$, and we let
$x = X + (f(X, Y))$ and $y = Y + (f(X, Y))$. Then $x$ is transcendental over
$K$, and the equation $y^2 = h(x)$ shows that $y$ is algebraic over $K(x)$. Hence
$\operatorname{tr.deg} R = 1$, and the corresponding $V(I)$ has $\dim V(I) = 1$. If $(x_0, y_0)$ is a
point of $V(I)$, then the matrix of first partial derivatives is

$$\Big(\tfrac{\partial f}{\partial X}\ \ \tfrac{\partial f}{\partial Y}\Big)\Big|_{(x_0, y_0)} = \big(-h'(X)\ \ 2Y\big)\Big|_{(x_0, y_0)}.$$

The rank of this matrix is $\le 1$, and nonsingularity of $(x_0, y_0)$ means that the
matrix has rank equal to 1. If the characteristic is $\ne 2$, then the condition for a
singularity is that $y_0^2 = h(x_0)$, $y_0 = 0$, and $h'(x_0) = 0$ simultaneously. Hence
$V(I)$ is everywhere nonsingular⁷ if and only if $h$ has no multiple roots in $K$.
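The criterion at the end of Example (1) can be tested mechanically when $h$ has rational coefficients. The following Python sketch is only an illustration under stated assumptions: it works in characteristic 0, stores a polynomial as a list of coefficients from the constant term upward, and relies on the standard fact that $h$ has a multiple root in an algebraic closure exactly when $\gcd(h, h')$ is nonconstant. The function names are hypothetical and are not notation from the text.

```python
from fractions import Fraction


def trim(p):
    """Drop trailing zero coefficients (the highest-degree end of the list)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def derivative(p):
    """Formal derivative of a polynomial given as [c0, c1, c2, ...]."""
    return trim([Fraction(k) * c for k, c in enumerate(p)][1:])


def poly_mod(a, b):
    """Remainder of a on division by the nonzero polynomial b, over Q."""
    a, b = trim(a), trim(b)
    while a and len(a) >= len(b):
        factor = a[-1] / b[-1]
        shift = len(a) - len(b)
        for i, c in enumerate(b):
            a[shift + i] -= factor * c
        a = trim(a)
    return a


def poly_gcd(a, b):
    """Monic greatest common divisor of two polynomials over Q (a nonzero)."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, poly_mod(a, b)
    return [c / a[-1] for c in a]


def has_multiple_root(h):
    """True when h and h' share a root in an algebraic closure of Q."""
    h = [Fraction(c) for c in h]
    return len(poly_gcd(h, derivative(h))) > 1


def curve_is_nonsingular(h):
    """Criterion of Example (1) for Y^2 = h(X), assuming characteristic 0."""
    return not has_multiple_root(h)


# h(X) = X^3 - X has three distinct roots.
smooth = curve_is_nonsingular([0, -1, 0, 1])
# h(X) = X^3 - X^2 = X^2 (X - 1) has a double root at 0.
singular = not curve_is_nonsingular([0, 0, -1, 1])
```

For $h(X) = X^3 - X^2$ the common factor of $h$ and $h'$ is $X$, which locates the singular point $(0, 0)$ of the curve $Y^2 = h(X)$.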


(2) Irreducible affine hypersurface⁸ in $K^n$. For general $n$, again suppose that
$I$ is a prime ideal generated by a single nonconstant polynomial $f(X_1, \dots, X_n)$.
The condition on $f$ for $I$ to be prime is that $f$ be irreducible in $K[X_1, \dots, X_n]$.
The relevant ring is $R = K[X_1, \dots, X_n]/(f(X_1, \dots, X_n))$, and the image in $R$
of a polynomial $g(X_1, \dots, X_n)$ is 0 only if $g$ is divisible by $f$, by Corollary
7.2. The polynomial $f$ is nonconstant in some $X_j$, say for $j = n$. Then
no nonzero polynomial $g(X_1, \dots, X_{n-1})$ maps to 0 in $R$. Consequently the
elements $x_j = X_j + (f(X_1, \dots, X_n))$ have the property that $\{x_1, \dots, x_{n-1}\}$ is
a transcendence set in $R$. The equation $f(x_1, \dots, x_n) = 0$ shows that $x_n$ is
algebraic over $K(x_1, \dots, x_{n-1})$. Hence the corresponding $V(I)$ has $\dim V(I) =
\operatorname{tr.deg} R = n - 1$. The nonsingular points of $V(I)$ are the points of $V(I)$ for
which some first partial derivative of $f$ is nonzero.


⁶Some authors include irreducibility in the definition of “affine curve.” This book does not.

⁷If $K$ has characteristic 2 and if $x_0$ has the property that $h'(x_0) = 0$, then we can choose $y_0$ with
$y_0^2 = h(x_0)$ because $K$ is algebraically closed, and $(x_0, y_0)$ will be a singular point. Hence $V(I)$ is
everywhere nonsingular if and only if $h'$ is a nonzero constant.

⁸Some authors include irreducibility in the definition of “affine hypersurface.” This book does
not.

Theorem 7.23 (Zariski’s Theorem). With $K$ algebraically closed, let $I$ be a
prime ideal in $K[X_1, \dots, X_n]$, let $R = K[X_1, \dots, X_n]/I$, and let $V(I)$ be the
locus of common zeros of $I$ in $K^n$. If $x = (x_1, \dots, x_n)$ is a point of $V(I)$, define
$m_x$ to be the maximal ideal

$$m_x = \{f \in R \mid f(x) = 0\}$$

of $R$, let $R_x$ be the localization of $R$ with respect to $m_x$, and let $M_x$ be the maximal
ideal of $R_x$. Then

$$\dim_K (M_x/M_x^2) = \dim_K (m_x/m_x^2) \ge \dim V(I),$$

and $x$ is nonsingular if and only if equality holds. The set of nonsingular points
of $V(I)$ is nonempty.


REMARKS. We are going to prove for each point $x$ of $V(I)$ that

$$\dim_K (M_x/M_x^2) = \dim_K (m_x/m_x^2)$$

and that

$$\dim_K (m_x/m_x^2) + \operatorname{rank}\big[\tfrac{\partial f_i}{\partial X_j}(x)\big] = n,$$

where $\{f_i\}$ is a finite set of generators of $I$. Since by definition $x$ is nonsingular if
and only if $\operatorname{rank}\big[\frac{\partial f_i}{\partial X_j}(x)\big] = n - \dim V(I)$, it will follow that $x$ is a nonsingular point
if and only if $\dim_K (m_x/m_x^2) = \dim V(I)$. Only for the special case that $V(I)$ is an
irreducible affine hypersurface do we prove that the inequality $\dim_K (m_x/m_x^2) \ge
\dim V(I)$ always holds for all $x$ and that equality always holds for some $x$. The
general case will ultimately be reduced to the special case; we return to this matter
in Chapter X. The partial proof that we give in the present section will be preceded
by an example.


EXAMPLE 1, CONTINUED. Suppose that an affine variety $V$ in $K^2$ is obtained
from the irreducible polynomial $f(X, Y) = Y^2 - h(X)$. Let us assume that $K$
has characteristic $\ne 2$ and that $(0, 0)$ lies in $V$. The latter condition means that
$h(0) = 0$. Let $x = X + (f(X, Y))$ and $y = Y + (f(X, Y))$. Since $y^2 = h(x)$,
any polynomial in $(x, y)$ can be rewritten in such a way that the only powers of
$y$ that occur are 0 and 1. Thus $R = \{p(x) + y q(x) \mid p \in K[x],\ q \in K[x]\}$, and

$$m_{(0,0)} = \{x p(x) + y q(x) \mid p \in K[x],\ q \in K[x]\}.$$

The ideal $m_{(0,0)}^2$ consists of all sums of products of two elements of this kind.
From two polynomials $x p(x)$, we can get any polynomial $x^2 a(x)$; from $x p(x)$
and $y q(x)$, we can get any $x y b(x)$; and from two polynomials $y q(x)$, we can get
any $y^2 c(x) = h(x) c(x)$. Thus

$$m_{(0,0)}^2 = \{x^2 a(x) + h(x) c(x) + y x b(x)\}.$$

What happens depends on the first-degree term in $h(x)$. Examining the possibilities,
we see that

$$m_{(0,0)}^2 = \begin{cases} \{x a(x) + y x b(x)\} & \text{if } h'(0) \ne 0, \\ \{x^2 a(x) + y x b(x)\} & \text{if } h'(0) = 0. \end{cases}$$

Hence

$$m_{(0,0)}/m_{(0,0)}^2 \cong \begin{cases} K y & \text{if } h'(0) \ne 0, \\ K x + K y & \text{if } h'(0) = 0. \end{cases}$$

In other words, $\dim_K m_{(0,0)}/m_{(0,0)}^2$ equals 1 if $(0, 0)$ is nonsingular and equals 2 if
$(0, 0)$ is singular. Since $\dim V(I) = 1$, this result is consistent with the statement
of Theorem 7.23.
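The dimension count in this example agrees with the relation $\dim_K (m_x/m_x^2) + \operatorname{rank} = n$ announced in the remarks, and for a single defining polynomial the rank is 1 or 0 according as the gradient at the point is nonzero or zero. The Python sketch below is an illustration only. It assumes rational coefficients, stores a polynomial in $(X, Y)$ as a dictionary from exponent tuples to coefficients, and uses hypothetical function names.

```python
from fractions import Fraction


def partial(poly, j):
    """Formal partial derivative in variable j of {exponent_tuple: coefficient}."""
    out = {}
    for exps, c in poly.items():
        if exps[j] > 0:
            new = exps[:j] + (exps[j] - 1,) + exps[j + 1:]
            out[new] = out.get(new, 0) + c * exps[j]
    return out


def evaluate(poly, point):
    """Value of the polynomial at a point with rational coordinates."""
    total = Fraction(0)
    for exps, c in poly.items():
        term = Fraction(c)
        for coordinate, e in zip(point, exps):
            term *= Fraction(coordinate) ** e
        total += term
    return total


def cotangent_dimension(poly, point):
    """n - rank of the one-row matrix of first partials at a point of V(f)."""
    n = len(point)
    if evaluate(poly, point) != 0:
        raise ValueError("the point does not lie on the hypersurface")
    gradient = [evaluate(partial(poly, j), point) for j in range(n)]
    rank = 1 if any(g != 0 for g in gradient) else 0
    return n - rank


# f = Y^2 - (X^3 - X^2): here h(0) = 0 and h'(0) = 0.
cusp_like = {(0, 2): 1, (3, 0): -1, (2, 0): 1}
# f = Y^2 - (X^3 - X): here h(0) = 0 and h'(0) = -1.
smooth = {(0, 2): 1, (3, 0): -1, (1, 0): 1}
```

With these inputs `cotangent_dimension(cusp_like, (0, 0))` is 2 and `cotangent_dimension(smooth, (0, 0))` is 1, matching the two cases $h'(0) = 0$ and $h'(0) \ne 0$ of the example.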


PARTIAL PROOF OF THEOREM 7.23. As mentioned in the remarks, one thing
that we are going to prove for each point $x$ of $V(I)$ is that

$$\dim_K (m_x/m_x^2) + \operatorname{rank}\big[\tfrac{\partial f_i}{\partial X_j}(x)\big] = n, \qquad (*)$$

where $\{f_1, \dots, f_l\}$ is a finite set of generators of $I$.
Let $I_x$ be the pullback to $K[X_1, \dots, X_n]$ of the ideal $m_x$, i.e., let

$$I_x = \{f \in K[X_1, \dots, X_n] \mid f(x) = 0\} = (X_1 - x_1, \dots, X_n - x_n).$$

The $K$ linear mapping $f \mapsto f + I$ carries $I_x$ onto $m_x$; composing with the
quotient mapping $m_x \to m_x/m_x^2$ gives a $K$ linear mapping $\varphi$ of $I_x$ onto $m_x/m_x^2$.
If $f$ maps under $\varphi$ to the 0 coset, then $f + I = \sum_j (g_j + I)(h_j + I)$ for suitable
polynomials $g_j$ and $h_j$ with $g_j + I$ and $h_j + I$ in $m_x$. Then $f - \sum_j g_j h_j$ lies in $I$,
and $f$ is exhibited as a member of $I_x^2 + I$. Conversely $\varphi$ does carry $I_x^2$ and $I$ to
the 0 coset. Thus the kernel of $\varphi$ is exactly $I_x^2 + I$, and $\varphi$ descends to a $K$ linear
isomorphism $I_x/(I_x^2 + I) \cong m_x/m_x^2$. Therefore

$$\dim_K \big(I_x/(I_x^2 + I)\big) = \dim_K (m_x/m_x^2). \qquad (**)$$

We define a $K$ linear map $\theta$ of $K[X_1, \dots, X_n]$ to the space $M_{1n}(K)$ of all
$n$-dimensional row vectors over $K$ by

$$\theta(f) = \Big(\tfrac{\partial f}{\partial X_1}(x)\ \ \cdots\ \ \tfrac{\partial f}{\partial X_n}(x)\Big).$$

The product rule for differentiation shows that $\theta(I_x^2) = 0$. The ideal $I_x$, considered
as a $K$ vector space, is spanned by $I_x^2$ and the various polynomials $X_j - x_j$. Since
$\theta(X_j - x_j)$ is the $j$th standard basis vector of $M_{1n}(K)$, the vectors $\theta(X_j - x_j)$
form a basis of $M_{1n}(K)$. Therefore $\theta$ descends to a $K$ linear isomorphism
$\bar\theta : I_x/I_x^2 \to M_{1n}(K)$.

We observed just before Examples 1 and 2 that the vector space of all row
vectors $\theta(f)$ for $f \in I$ equals the row space for the matrix $\big[\frac{\partial f_i}{\partial X_j}(x)\big]$. Hence

$$\dim_K \theta(I) = \operatorname{rank}\big[\tfrac{\partial f_i}{\partial X_j}(x)\big].$$

Since $\theta(I) = \bar\theta\big((I + I_x^2)/I_x^2\big)$ and since $\bar\theta$ is one-one, this equality shows that

$$\dim_K \big((I + I_x^2)/I_x^2\big) = \operatorname{rank}\big[\tfrac{\partial f_i}{\partial X_j}(x)\big]. \qquad (\dagger)$$

Adding $(**)$ and $(\dagger)$ gives

$$\dim_K (I_x/I_x^2) = \dim_K (m_x/m_x^2) + \operatorname{rank}\big[\tfrac{\partial f_i}{\partial X_j}(x)\big].$$

Since, as we have seen, $I_x/I_x^2$ is isomorphic to $M_{1n}(K)$ via $\bar\theta$, the left side is $n$,
and $(*)$ is proved.
The second thing that we are going to prove now is that

$$\dim_K (m_x/m_x^2) = \dim_K (M_x/M_x^2). \qquad (\dagger\dagger)$$

If $L$ is the field of fractions of the integral domain $R$, then the localization $R_x$ is the
subring of $L$ of all quotients $g/h$ with $g$ and $h$ in $R$ and $h(x) \ne 0$. The inclusion
$m_x \subseteq M_x$ induces a $K$ linear ring homomorphism $\varphi : m_x/m_x^2 \to M_x/M_x^2$, and
$(\dagger\dagger)$ will follow if $\varphi$ is shown to be one-one onto.

If $g/h$ is given in $M_x$ with $g \in m_x$ and with $h \in R$ having $h(x) \ne 0$, then the
decomposition

$$\frac{g}{h} = h(x)^{-1} g + \big(h(x)^{-1} g\big)\Big(\frac{h(x) - h}{h}\Big)$$

exhibits $h(x)^{-1} g$ in $m_x$ as mapping to $g/h + M_x^2$. Therefore $\varphi$ is onto.

If $g$ in $m_x$ maps to $\sum_i \big(\frac{g_i}{h_i}\big)\big(\frac{g_i'}{h_i'}\big)$ in $M_x^2$, then we can clear fractions and write
$hg = \sum_i g_i g_i' h_i''$ for an element $h$ of $R$ with $h(x) \ne 0$. Here $\sum_i g_i g_i' h_i''$ is in $m_x^2$.
The set of elements $f$ in $R$ such that $fg$ is in $m_x^2$ is an ideal in $R$ that contains $m_x$
and that contains $h$. Since $h$ is not in $m_x$ and since $m_x$ is maximal, this ideal in $R$
contains $f = 1$, and it follows that $g$ is in $m_x^2$. Consequently $\varphi$ is an isomorphism,
and $(\dagger\dagger)$ is proved.




PROOF OF REMAINDER OF THEOREM 7.23 FOR IRREDUCIBLE AFFINE HYPERSURFACES.
Let $I$ be the principal ideal $(f(X_1, \dots, X_n))$, where $f$ is irreducible.
We saw in Example 2 above that $\dim V(I) = n - 1$. The matrix that appears
in $(*)$ has only one row, corresponding to $f$, and hence it has rank 1 or rank 0.
Substituting this fact into $(*)$, we see that $\dim_K (m_x/m_x^2) \ge n - 1 = \dim V(I)$.

Arguing by contradiction, suppose that strict inequality holds for every $x$ in
$V(I)$. Then $\frac{\partial f}{\partial X_j}(x) = 0$ for all $x \in V(I)$ and for all $j$. By Corollary 7.2, each
$\frac{\partial f}{\partial X_j}$ is the product of $f$ and a polynomial. Since the degree of $\frac{\partial f}{\partial X_j}$ in $X_j$ is less
than the degree of $f$ in $X_j$, it follows that $\frac{\partial f}{\partial X_j} = 0$ for all $j$. In characteristic 0,
this condition forces $f$ to be constant and contradicts the assumption that $f$ is
an irreducible polynomial (and in particular the assumption that $f$ is not a unit).
In characteristic $p$, this condition forces each power of each $X_j$ that occurs in
$f$ to be a multiple of $p$. That is, it says that $f(X_1, \dots, X_n) = g(X_1^p, \dots, X_n^p)$.
Let $\mathrm{Fr} : K \to K$ be the field map given by $a \mapsto a^p$. This is onto $K$, since
$K$ is algebraically closed. Hence there exists a polynomial $h(X_1, \dots, X_n)$ such
that $h^{\mathrm{Fr}} = g$, where $h^{\mathrm{Fr}}$ denotes the result of applying $\mathrm{Fr}$ to each coefficient of $h$.
Then $f(X_1, \dots, X_n) = g(X_1^p, \dots, X_n^p) = h^{\mathrm{Fr}}(X_1^p, \dots, X_n^p) = (h(X_1, \dots, X_n))^p$
exhibits $f$ as reducible, contradiction. Hence strict inequality cannot hold for all
$x \in V(I)$, and some point of $V(I)$ is nonsingular.

In this section, $K$ denotes a field, and $K_{\mathrm{alg}}$ denotes a fixed algebraic closure of
$K$. We define $K_{\mathrm{sep}}$ to be the subfield of all elements of $K_{\mathrm{alg}}$ that are separable
over $K$. The field $K_{\mathrm{sep}}$ is called a separable algebraic closure of $K$. Theorem
7.13 shows that $K_{\mathrm{alg}}$ is a purely inseparable extension of $K_{\mathrm{sep}}$. If $F_1$ and $F_2$ are
any fields with $F_1 \subseteq F_2$, then the group of all field automorphisms of $F_2$ fixing
$F_1$ is denoted by $\mathrm{Gal}(F_2/F_1)$ and is called the Galois group of $F_2$ over $F_1$.

The purpose of this section is to extend the theory of Galois groups to handle
infinite extensions. Such an extended theory has at least two important applications
in the current context. A first application is to developments in algebraic
number theory beyond what appears in Chapters V and VI. For example one way
of viewing traditional class field theory for a number field $F$ is that one forms
$\mathrm{Gal}(F_{\mathrm{alg}}/F)$, defines the maximal abelian extension $F_{\mathrm{ab}}$ of $F$ to be the fixed
field of the closure of the commutator subgroup of $\mathrm{Gal}(F_{\mathrm{alg}}/F)$, and asks for a
description of $F_{\mathrm{ab}}$ in terms of $F$. A second application is to the study of varieties
over fields that are not algebraically closed. If a field $K$ is given and a prime ideal
$I$ in $K_{\mathrm{alg}}[X_1, \dots, X_n]$ is specified by giving a finite set of generators, we can ask
whether the same ideal can be defined via generators that lie in $K$. The given
generators have coefficients in $K_{\mathrm{alg}}$, and it is usually not obvious whether they
can be adjusted to have coefficients in $K$. However, if Galois theory is available,
then the question becomes whether the operation of each element of the Galois
group $\mathrm{Gal}(K_{\mathrm{alg}}/K)$ carries each generator into a member of the ideal,⁹ and this
question is decidable by methods to be discussed in Chapter VIII. More generally
algebraic geometry from before 1960 frequently worked with a field $K$ and an
algebraically closed field $L$ that is larger than $K_{\mathrm{alg}}$, for example with $K = \mathbb{Q}$ and
$L = \mathbb{C}$. Under the assumption that $K$ is perfect and $L$ is algebraically closed,
Theorem 7.34 below shows that $\mathrm{Gal}(L/K)$ fixes only the elements of $K$, and thus
Galois theory can still be used to decide in this situation whether a prime ideal in
$L[X_1, \dots, X_n]$ is generated by members of $K[X_1, \dots, X_n]$.

The definition of “normal field extension” in Basic Algebra was limited to
suitable finite separable algebraic extensions. We now drop both the finiteness
assumption and the separability assumption: A field $L$ with $K \subseteq L \subseteq K_{\mathrm{alg}}$ is
said to be a normal extension of $K$ if there exists some nonempty family $\{f_i\}_{i \in S}$
of nonconstant polynomials in $K[X]$ such that $L$ is generated by $K$ and all the
roots in $K_{\mathrm{alg}}$ of all the polynomials $f_i$. More specifically all the polynomials $f_i$
split in $K_{\mathrm{alg}}$, say as $f_i(X) = c_i \prod_j (X - \alpha_{ij})$, and $L$ is to be the subfield of
$K_{\mathrm{alg}}$ generated by $K$ and all the roots $\alpha_{ij}$.


Proposition 7.24. The following conditions on a field $L$ with $K \subseteq L \subseteq K_{\mathrm{alg}}$
are equivalent:

(a) $L$ is a normal extension of $K$,

(b) $\mathrm{Gal}(K_{\mathrm{alg}}/K)$ carries $L$ to itself,

(c) any $K$ isomorphism of $L$ into $K_{\mathrm{alg}}$ carries $L$ to itself,

(d) any polynomial $f$ in $K[X]$ that is irreducible over $K$ and has one root in
$L$ necessarily splits in $L$.


PROOF. If (a) holds, let $L$ be generated by $K$ and elements $\alpha_{ij}$ as in the
paragraph before the proposition. If $\varphi$ is in $\mathrm{Gal}(K_{\mathrm{alg}}/K)$, then $\varphi(\alpha_{ij})$ is a root of
$f_i^{\varphi} = f_i$ because $f_i$ has coefficients in $K$. Hence $\varphi(\alpha_{ij})$ equals some $\alpha_{ij'}$. Thus $\varphi$
permutes the generators of $L$ over $K$, and $\varphi(L) = L$. Therefore (b) holds.

If (b) holds, then any $K$ field map of $L$ into $K_{\mathrm{alg}}$ extends to a $K$ automorphism
of $K_{\mathrm{alg}}$, by Theorem 9.23 of Basic Algebra. By (b), the extended mapping carries
$L$ into itself. Thus (c) holds.

If (c) holds, let $f$ in $K[X]$ be irreducible over $K$, and suppose that $x_0$ is a
root of $f$ in $L$. Let $x_1$ be another root of $f$ in $K_{\mathrm{alg}}$. By the uniqueness of
simple extensions, we know that there exists a $K$ isomorphism $\varphi : K(x_0) \to
K(x_1) \subseteq K_{\mathrm{alg}}$, and we can regard $\varphi$ as a $K$ field map of $K(x_0)$ into $K_{\mathrm{alg}}$. The
map $\varphi$ extends to a $K$ field automorphism of $K_{\mathrm{alg}}$, and we restrict the extension
to a map $\varphi : L \to K_{\mathrm{alg}}$. By (c), $\varphi(L) \subseteq L$. Since $K(x_0) \subseteq L$, we obtain
$K(x_1) = \varphi(K(x_0)) \subseteq \varphi(L) \subseteq L$. Thus $x_1$ is in $L$, and (d) holds.

⁹This condition is always necessary. For it to be sufficient, one has to show that the only members
of $K_{\mathrm{alg}}$ fixed by all elements of $\mathrm{Gal}(K_{\mathrm{alg}}/K)$ are the members of $K$.

If (d) holds, then for each element $x_i$ of $L$, let $f_i$ be the minimal polynomial of
$x_i$ over $K$. Then certainly $L$ is generated by $K$ and the elements $x_i$. By (d), each
$f_i$ splits in $L$. Therefore $L$ is generated over $K$ by all the roots of the polynomials
$f_i$ and is normal. Thus (a) holds.


Proposition 7.25. Every member of $\mathrm{Gal}(K_{\mathrm{alg}}/K)$ carries $K_{\mathrm{sep}}$ into itself, any
two members of $\mathrm{Gal}(K_{\mathrm{alg}}/K)$ that agree on $K_{\mathrm{sep}}$ are equal on $K_{\mathrm{alg}}$, and any field
map of $K_{\mathrm{sep}}$ into $K_{\mathrm{alg}}$ extends to an automorphism of $K_{\mathrm{alg}}$. Consequently the
operation of restriction from $K_{\mathrm{alg}}$ to $K_{\mathrm{sep}}$ defines an isomorphism

$$\mathrm{Gal}(K_{\mathrm{alg}}/K) \cong \mathrm{Gal}(K_{\mathrm{sep}}/K).$$


PROOF. The first statement has three conclusions to it. For the first conclusion,
if $\varphi$ is in $\mathrm{Gal}(K_{\mathrm{alg}}/K)$ and if $x_0$ is in $K_{\mathrm{sep}}$, let $f$ be the minimal polynomial of $x_0$
over $K$. By separability, $f$ is a separable polynomial over $K$. Since $\varphi$ fixes $f$,
$\varphi$ carries $x_0$ to some root $x_1$ of $f$, and hence $f$ is the minimal polynomial of $x_1$
over $K$. Since $f$ is a separable polynomial over $K$, $x_1$ is separable over $K$ and
lies in $K_{\mathrm{sep}}$.

For the second conclusion, let $\varphi$ be a member of $\mathrm{Gal}(K_{\mathrm{alg}}/K)$ that is 1 on $K_{\mathrm{sep}}$.
If $x$ is in $K_{\mathrm{alg}}$, then the pure inseparability of $K_{\mathrm{alg}}/K_{\mathrm{sep}}$ implies that $x^{p^e} = a$ for
some $a \in K_{\mathrm{sep}}$ and some integer $e \ge 0$. The element $x$ has $(X - x)^{p^e} =
X^{p^e} - x^{p^e} = X^{p^e} - a$ and hence is the unique root of $X^{p^e} - a$. Since $\varphi(x)$ has
to be a root of this polynomial, $\varphi(x) = x$.

The third conclusion is a special case of the extendability to all of $K_{\mathrm{alg}}$ of any
field mapping of a subfield of $K_{\mathrm{alg}}$ into $K_{\mathrm{alg}}$. For the displayed isomorphism
the first conclusion shows that restriction carries $\mathrm{Gal}(K_{\mathrm{alg}}/K)$ into $\mathrm{Gal}(K_{\mathrm{sep}}/K)$,
the second conclusion shows that restriction is one-one, and the third conclusion
shows that restriction is onto.


Corollary 7.26. Let $L$ be a field with $K \subseteq L \subseteq K_{\mathrm{sep}}$, form $\mathrm{Gal}(L/K)$, and
let $L^{\mathrm{Gal}(L/K)}$ be the fixed field

$$L^{\mathrm{Gal}(L/K)} = \{x \in L \mid \gamma x = x \text{ for all } \gamma \in \mathrm{Gal}(L/K)\}.$$

Then $L$ is normal over $K$ if and only if $L^{\mathrm{Gal}(L/K)} = K$.


PROOF. Let $L$ be normal over $K$, let $x$ be in $L^{\mathrm{Gal}(L/K)}$, and let $f$ be the minimal
polynomial of $x$ over $K$. Since $L$ is normal, $f$ splits in $L$. Since $L \subseteq K_{\mathrm{sep}}$, the
roots of $f$ in $L$ all have multiplicity one. Arguing by contradiction, suppose
that $x$ is not in $K$. Then $\deg f > 1$, and $f$ has another root $x_1$ besides $x$.
Hence we can find a $K$ isomorphism $\varphi : K(x) \to K(x_1)$ with $\varphi(x) = x_1$. The
mapping $\varphi$ extends to a field automorphism of $K_{\mathrm{alg}}$, and Proposition 7.24 shows
that $\varphi(L) = L$, since $L$ is normal. Thus $\varphi$ defines by restriction a member of
$\mathrm{Gal}(L/K)$. Since $\varphi(x) = x_1$, we have a contradiction to the assumption that $x$ is
in $L^{\mathrm{Gal}(L/K)}$. Hence $L^{\mathrm{Gal}(L/K)} = K$.

Conversely let $L^{\mathrm{Gal}(L/K)} = K$. Let $x$ be in $L$, and let $f$ be its minimal
polynomial over $K$. Let $x_1 = x$ and $x_2, \dots, x_n$ be the distinct images of $x$ in
$L$ under members of $\mathrm{Gal}(L/K)$. These are all roots of $f$, and the roots of $f$
have multiplicity 1 because $x$ lies in $K_{\mathrm{sep}}$. Each member of $\mathrm{Gal}(L/K)$ permutes
$x_1, \dots, x_n$ and hence acts via a permutation in the symmetric group $\mathfrak{S}_n$. Put
$g(X) = \prod_{j=1}^{n} (X - x_j)$. Expanding $g$ gives

$$g(X) = X^n - \Big(\sum_i x_i\Big) X^{n-1} + \Big(\sum_{i<j} x_i x_j\Big) X^{n-2} - \cdots \pm \Big(\prod_i x_i\Big).$$

Each permutation of $\{x_1, \dots, x_n\}$ fixes the coefficients of $g(X)$, which are members
of $L$, and hence the coefficients are in $L^{\mathrm{Gal}(L/K)} = K$. Therefore $g(X)$ is
in $K[X]$. Since $g(x) = 0$, $f(X)$ divides $g(X)$. Over $L$, $g(X)$ splits. By unique
factorization in $L[X]$, $f(X)$ must split, too. By Proposition 7.24, $L$ is normal
over $K$.
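The key step in the converse direction, that the product of $X - x_j$ over the Galois orbit of $x$ has coefficients in the fixed field, can be seen in the smallest nontrivial case. The Python sketch below is an illustration only. It assumes $K = \mathbb{Q}$ and $L = \mathbb{Q}(\sqrt 2)$, whose Galois group consists of the identity and the conjugation $a + b\sqrt 2 \mapsto a - b\sqrt 2$, and it represents $a + b\sqrt 2$ as the pair `(a, b)`. The representation and the function names are hypothetical.

```python
from fractions import Fraction


def add(u, v):
    """Sum of a + b*sqrt(2) and c + d*sqrt(2), each stored as a pair."""
    return (u[0] + v[0], u[1] + v[1])


def mul(u, v):
    """Product in Q(sqrt 2), using sqrt(2)^2 = 2."""
    return (u[0] * v[0] + 2 * u[1] * v[1], u[0] * v[1] + u[1] * v[0])


def conjugate(u):
    """The nontrivial automorphism of Q(sqrt 2) over Q."""
    return (u[0], -u[1])


GALOIS_GROUP = [lambda u: u, conjugate]


def orbit(x):
    """Distinct images of x under the members of the Galois group."""
    images = []
    for sigma in GALOIS_GROUP:
        y = sigma(x)
        if y not in images:
            images.append(y)
    return images


def orbit_polynomial(x):
    """Coefficients of prod_j (X - x_j), highest degree first."""
    one = (Fraction(1), Fraction(0))
    points = orbit(x)
    if len(points) == 1:
        return [one, (-points[0][0], -points[0][1])]
    s = add(points[0], points[1])
    p = mul(points[0], points[1])
    return [one, (-s[0], -s[1]), p]


# x = 1 + sqrt(2) has orbit {1 + sqrt 2, 1 - sqrt 2}.
coefficients = orbit_polynomial((Fraction(1), Fraction(1)))
# Every coefficient has zero sqrt(2)-part, so it lies in Q.
all_rational = all(c[1] == 0 for c in coefficients)
```

Here the orbit polynomial is $X^2 - 2X - 1$, which is the minimal polynomial of $1 + \sqrt 2$ over $\mathbb{Q}$.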


To obtain a version of the Fundamental Theorem of Galois Theory in the 
present context, it is necessary to introduce a topology on each Galois group. An 
example will illustrate. 


EXAMPLE. Let $K$ be the finite field $\mathbb{F}_q$, where $q$ is a power of a prime $p$. If $L_n$
is a finite extension of $K$ of degree $n$, then Proposition 9.40 of Basic Algebra
shows that $\mathrm{Gal}(L_n/K)$ is cyclic of order $n$, a generator being the Frobenius
element $\mathrm{Fr}_q$ defined by $\mathrm{Fr}_q(x) = x^q$. The thing about the Frobenius element is
that it really makes sense on all $L_n$'s simultaneously. We know (from Proposition
7.15 for example) that every algebraic extension of $K$ is separable, and hence
$K_{\mathrm{sep}} = K_{\mathrm{alg}}$. Here we can view $K_{\mathrm{sep}}$ as an aligned union of the fields $L_n$ for
$n \ge 1$, and $\mathrm{Fr}_q$ really makes sense as a member of $\mathrm{Gal}(K_{\mathrm{sep}}/K)$ under the same
definition: $\mathrm{Fr}_q(x) = x^q$. On each $L_n$, some nonzero power of $\mathrm{Fr}_q$ is the identity,
but this is no longer true on the infinite field $K_{\mathrm{sep}}$. Thus the mapping $1 \mapsto \mathrm{Fr}_q$
extends to a one-one homomorphism of $\mathbb{Z}$ into $\mathrm{Gal}(K_{\mathrm{sep}}/K)$. However, it is not
onto. Any element $\gamma$ of $\mathrm{Gal}(K_{\mathrm{sep}}/K)$ has the property that for each $n$, there is
a unique integer $k_n$ with $0 \le k_n < n$ such that $\gamma|_{L_n} = \mathrm{Fr}_q^{k_n}$, and the sequence
$\{k_n\}$ determines $\gamma$; nevertheless Problem 3 at the end of the chapter shows that
the sequence need not ultimately be constant, and therefore $\gamma$ need not be in
the image of $\mathbb{Z}$. The Galois group $\mathrm{Gal}(K_{\mathrm{sep}}/K)$ is instead a certain topological
completion of $\mathbb{Z}$ that is usually denoted by $\widehat{\mathbb{Z}}$. Taking the topology into account
will be essential to extending the Fundamental Theorem of Galois Theory, since
$\mathbb{Z}$ and $\widehat{\mathbb{Z}}$ are distinct subgroups of $\mathrm{Gal}(K_{\mathrm{sep}}/K)$ that have the same fixed field,
namely $K$ itself.
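The statement that the Frobenius element has order exactly $n$ on the extension of degree $n$ can be checked by brute force in small cases. The Python sketch below is an illustration only. It assumes $q = 2$ and realizes the fields with 4, 8, and 16 elements as $\mathbb{F}_2[t]$ modulo the irreducible polynomials $t^2 + t + 1$, $t^3 + t + 1$, and $t^4 + t + 1$, with an element stored as an integer whose bits are its coefficients. The encoding and the function names are hypothetical.

```python
def gf_mul(a, b, modulus, degree):
    """Multiply two elements of F_2[t]/(modulus) by shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result ^= a          # add the current multiple of a
        b >>= 1
        a <<= 1                  # multiply a by t
        if (a >> degree) & 1:
            a ^= modulus         # reduce modulo the defining polynomial
    return result


def frobenius_order(modulus, degree):
    """Smallest k >= 1 with Fr_2^k equal to the identity on the whole field."""
    elements = list(range(2 ** degree))
    current = list(elements)     # current[i] holds Fr_2^k applied to element i
    k = 0
    while True:
        current = [gf_mul(x, x, modulus, degree) for x in current]
        k += 1
        if current == elements:
            return k


# (modulus, degree) pairs for the fields with 4, 8, and 16 elements.
FIELDS = [(0b111, 2), (0b1011, 3), (0b10011, 4)]
orders = [frobenius_order(m, d) for m, d in FIELDS]
```

The list `orders` comes out as `[2, 3, 4]`, so no single nonzero power of $\mathrm{Fr}_2$ is the identity on all three fields at once unless it is a multiple of 12.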


If $L$ is a normal extension of $K$ with $L \subseteq K_{\mathrm{sep}}$, we shall introduce a topology on
$\mathrm{Gal}(L/K)$ to make “close” mean “equal on a large finite-dimensional subspace.”
With this intuition as a guide, we could define a basic neighborhood of an element
$\gamma_0$ of $\mathrm{Gal}(L/K)$ by taking finitely many elements $\alpha_1, \dots, \alpha_n$ in $L$ and forming

$$\{\gamma \in \mathrm{Gal}(L/K) \mid \gamma\alpha_i = \gamma_0\alpha_i \text{ for } 1 \le i \le n\}.$$

It is more useful, however, to define the topology in another way, and then it will
turn out that we indeed would have obtained a neighborhood basis by the above
definition. In any event, the topology turns out to be compact Hausdorff and to
make $\mathrm{Gal}(L/K)$ into a topological group.

The method we use will be to define the topology as an “inverse limit.” In- 
verse limit is a general notion in category theory defined by a universal mapping 
property. As usual it consists of an object and a morphism; it need not exist in a 
general category, but when it does exist, it is unique up to canonical isomorphism. 
For the category of interest, the objects are the compact (Hausdorff) topological 
groups, and the morphisms are continuous group homomorphisms. If we wanted 
to emphasize the category-theory aspects of the construction, we would also need 
products of this category with itself, but we shall not belabor this point. 

Let $I$ be a directed set, i.e., a nonempty partially ordered set under an ordering
$\le$ such that for any $a$ and $b$ in $I$, there is an element $c$ in $I$ with $a \le c$ and $b \le c$.
We allow ourselves to write $b \ge a$ in place of $a \le b$ whenever convenient. Two
examples of directed sets of particular interest both have $I = \{1, 2, 3, \dots\}$; in
one case the ordering is given by $a \le b$ if $a$ divides $b$, and in the other case the
ordering is given by the usual notion of inequality.

An inverse system $(I, \{G_i\}, \{f_{ij}\})$ in the category of compact topological
groups consists of a directed set $I$, a system of compact topological groups $G_i$,
one for each $i \in I$, and a system of continuous homomorphisms $f_{ij} : G_j \to G_i$,
defined whenever $i$ and $j$ are in $I$ with $i \le j$, such that

• $f_{ii} = 1$ for all $i \in I$,
• $f_{ij} \circ f_{jk} = f_{ik}$ whenever $i \le j \le k$.


EXAMPLES.

(1) Let $I = \{1, 2, 3, \dots\}$ with $a \le b$ meaning that $a$ divides $b$. Let $G_a$ be the
cyclic group $\mathbb{Z}/a\mathbb{Z}$ of order $a$. Define $f_{ab} : G_b \to G_a$ to be the homomorphism
such that $f_{ab}(1 + b\mathbb{Z}) = 1 + a\mathbb{Z}$.

(2) Let $I = \{1, 2, 3, \dots\}$ with the usual ordering. Fix a prime number $p$, and
define $G_a$ to be the cyclic group $\mathbb{Z}/p^a\mathbb{Z}$ of order $p^a$. Define $f_{ab} : G_b \to G_a$ to
be the homomorphism such that $f_{ab}(1 + p^b\mathbb{Z}) = 1 + p^a\mathbb{Z}$.

An inverse limit $(G, \{f_i\}_{i \in I})$ of the inverse system $(I, \{G_i\}, \{f_{ij}\})$, often
written $G = \varprojlim G_i$ and sometimes also called the projective limit, consists
of a compact topological group $G$ and continuous homomorphisms $f_i : G \to G_i$
such that
(i) $f_{ij} \circ f_j = f_i$ whenever $i \le j$,
(ii) whenever $(G', \{f_i'\}_{i \in I})$ is a pair consisting of a compact topological group
$G'$ and continuous homomorphisms $f_i' : G' \to G_i$ such that $i \le j$ implies
$f_{ij} \circ f_j' = f_i'$, then there exists a unique continuous homomorphism
$F : G' \to G$ such that $f_i \circ F = f_i'$ for all $i$.


In the two examples the inverse limit group in the first case is $\widehat{\mathbb{Z}}$; in the second
case the inverse limit is isomorphic to the additive group $\mathbb{Z}_p$ of $p$-adic integers.
In the first case we omit a description of the homomorphisms $f_a : \widehat{\mathbb{Z}} \to \mathbb{Z}/a\mathbb{Z}$. In
the second case the homomorphisms $f_a$ are easy to describe: $f_a : \mathbb{Z}_p \to \mathbb{Z}/p^a\mathbb{Z}$
is given by the composition of the quotient homomorphism $\mathbb{Z}_p \to \mathbb{Z}_p/p^a\mathbb{Z}_p$ and
the isomorphism $\mathbb{Z}_p/p^a\mathbb{Z}_p \to \mathbb{Z}/p^a\mathbb{Z}$ asserted by Theorem 6.26e.
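The axioms of an inverse system can be verified directly on a finite portion of the first example, where the index set is ordered by divisibility and the groups are $\mathbb{Z}/a\mathbb{Z}$. The Python sketch below is an illustration only. It assumes that a class modulo $a$ is stored as its representative in $\{0, \dots, a - 1\}$, and it checks only the indices up to a chosen bound. The function names are hypothetical.

```python
from math import gcd


def divides(a, b):
    """The ordering a <= b of the index set: a divides b."""
    return b % a == 0


def transition(a, b):
    """f_ab : Z/bZ -> Z/aZ for a dividing b, sending 1 + bZ to 1 + aZ."""
    if not divides(a, b):
        raise ValueError("f_ab is defined only when a divides b")
    return lambda g: g % a


def upper_bound(a, b):
    """A common upper bound of a and b for divisibility: their lcm."""
    return a * b // gcd(a, b)


def is_directed(limit):
    """Every pair of indices up to limit has an upper bound in the ordering."""
    indices = range(1, limit + 1)
    return all(divides(a, upper_bound(a, b)) and divides(b, upper_bound(a, b))
               for a in indices for b in indices)


def satisfies_axioms(limit):
    """Check f_aa = 1 and f_ab o f_bc = f_ac for all indices up to limit."""
    indices = range(1, limit + 1)
    for a in indices:
        if any(transition(a, a)(g) != g for g in range(a)):
            return False
    for a in indices:
        for b in indices:
            for c in indices:
                if divides(a, b) and divides(b, c):
                    f_ab, f_bc, f_ac = transition(a, b), transition(b, c), transition(a, c)
                    if any(f_ab(f_bc(g)) != f_ac(g) for g in range(c)):
                        return False
    return True
```

Calling `is_directed(12)` and `satisfies_axioms(12)` both return `True`. Composition is tested only along chains $a \mid b \mid c$, which is where the axioms require the transition maps to be defined.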


Proposition 7.27. In the category of compact topological groups, an inverse
system $(I, \{G_i\}, \{f_{ij}\})$ has at least one inverse limit, namely $(G, \{f_i\}_{i \in I})$ with

$$G = \Big\{(g_i)_{i \in I} \in \prod_{i \in I} G_i \ \Big|\ f_{ij}(g_j) = g_i \text{ whenever } i \le j\Big\},$$

$$f_i = \text{restriction to } G \text{ of the } i\text{th projection } \prod_{j \in I} G_j \to G_i.$$


REMARKS. It is to be understood from the statement that $G$ gets the relative
topology from $\prod_{i \in I} G_i$. We refer to this $(G, \{f_i\}_{i \in I})$ as the standard inverse
limit of $(I, \{G_i\}, \{f_{ij}\})$.


PROOF. If $(g_i)_{i \in I}$ and $(g_i')_{i \in I}$ are in $G$, then the fact that each $f_{ij}$ is a homomorphism
implies that $f_{ij}(g_j g_j') = g_i g_i'$ and that $f_{ij}(g_j^{-1}) = g_i^{-1}$. Therefore $(g_i g_i')_{i \in I}$
and $(g_i^{-1})_{i \in I}$ are in $G$, and $G$ is a group. The subset of $G_i \times G_j$ with $f_{ij}(x_j) = x_i$
is topologically closed, and it follows that $G$ is the intersection of closed sets and
hence is closed. Since $\prod_{j \in I} G_j$ is compact Hausdorff, $G$ is compact Hausdorff.
The continuity of the multiplication and inversion is a consequence of those
properties for $\prod_{j \in I} G_j$. The $i$th projection of $\prod_{j \in I} G_j$ onto $G_i$ is a continuous
homomorphism, and hence so is the restriction of this projection to $G$.

Condition (i) in the definition of inverse limit is immediate, and we have to
prove (ii). Let $(G', \{f_i'\}_{i \in I})$ be given with each $f_i' : G' \to G_i$ having the
property that $i \le j$ implies $f_{ij} \circ f_j' = f_i'$. For each $g'$ in $G'$, the $I$-tuple
$(f_i'(g'))_{i \in I}$ is a member of $\prod_i G_i$, and the map $g' \mapsto (f_i'(g'))_{i \in I}$ is continuous
into the product topology because each entry is continuous. If $i \le j$, then
the tuple $(f_i'(g'))_{i \in I}$ has the property that $f_{ij}(f_j'(g')) = f_i'(g')$ because of the
given compatibility condition for the $f_i'$'s. Therefore the map $F$ given by $g' \mapsto
(f_i'(g'))_{i \in I}$ has its image in the subset $G$ of $\prod_i G_i$, and it is evidently a continuous
group homomorphism. The map $F$ proves the existence assertion in (ii) because
$f_i \circ F(g') = f_i\big((f_j'(g'))_{j \in I}\big) = f_i'(g')$.

For uniqueness, suppose that $H : G' \to G$ is a continuous homomorphism
such that $f_i \circ H = f_i'$ for all $i$. For each $g' \in G'$, we have $f_i(H(g')) = f_i'(g')$.
Thus $H(g')$ is the member $(g_i)_{i \in I}$ of $\prod_{i \in I} G_i$ for which $g_i = f_i'(g')$ for all $i$.
Hence $H$ is uniquely determined.
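The description of the standard inverse limit as the set of compatible tuples can be explored on a finite truncation of the inverse system of groups $\mathbb{Z}/p^a\mathbb{Z}$. The Python sketch below is an illustration only. It assumes $p = 3$, keeps just the first few coordinates of a tuple, and stores a class modulo $p^a$ as its representative in $\{0, \dots, p^a - 1\}$. The function names are hypothetical. The family with coordinates $1 + p + \cdots + p^{a-1}$ is compatible, and multiplying it by $1 - p$ gives 1 in every coordinate, so for $p = 3$ it behaves like $-1/2$ and is not the image of any integer.

```python
def is_compatible(family, p):
    """Check f_ab(g_b) = g_a for all a <= b, where family[a - 1] lies in Z/p^a Z."""
    depth = len(family)
    return all(family[b - 1] % p ** a == family[a - 1]
               for b in range(1, depth + 1)
               for a in range(1, b + 1))


def image_of_integer(n, p, depth):
    """Coordinates of the integer n in the first depth levels."""
    return [n % p ** a for a in range(1, depth + 1)]


def geometric_family(p, depth):
    """Coordinates 1 + p + ... + p^(a-1) for a = 1, ..., depth."""
    return [sum(p ** k for k in range(a)) for a in range(1, depth + 1)]


P = 3
DEPTH = 5
family = geometric_family(P, DEPTH)            # [1, 4, 13, 40, 121]
compatible = is_compatible(family, P)
# (1 - p) times the family is 1 in each coordinate.
inverse_check = all(((1 - P) * g) % P ** a == 1
                    for a, g in enumerate(family, start=1))
```

The image of an ordinary integer, such as `image_of_integer(7, 3, 5)`, is also compatible; for a nonnegative integer the representatives are eventually constant, whereas those of `family` keep growing with the level.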




Proposition 7.28. In the category of compact topological groups, any two
inverse limits for an inverse system $(I, \{G_i\}, \{f_{ij}\})$ are canonically isomorphic.


PROOF. This is a special case of the uniqueness in category theory of objects 
having a specific universal mapping property, as established in Basic Algebra. 


It is important in applications that the inverse limit of an inverse system of 
compact groups depend only on what happens far out in the directed set. We have 
not yet used that the indexing set is a directed set, rather than merely a partially 
ordered set, and we shall use this property now. 


Corollary 7.29. Let $I$ be a directed set, let $j_0$ be in $I$, and let $I'$ be the set of
members of $I$ that are $\ge j_0$. If $(I, \{G_i\}, \{f_{ij}\})$ is an inverse system of compact
groups, then the two inverse systems $(I, \{G_i\}, \{f_{ij}\})$ and $(I', \{G_i\}, \{f_{ij}\})$ have
canonically isomorphic inverse limits, the isomorphism of the standard inverse
limit $G \subseteq \prod_{i \in I} G_i$ onto the standard inverse limit $G' \subseteq \prod_{i \ge j_0} G_i$ being given
by projection to the coordinates $\ge j_0$.


PROOF. Let $P : G \to G'$ be the projection, and let $f'_i : G' \to G_i$ for $i \ge j_0$
be the associated maps. Certainly $f'_i \circ P = f_i$ for $i \ge j_0$. We shall extend the
definition of $f'_i$ to apply to all $i \in I$. If $i \in I$ is given, we use the fact that $I$ is
directed to choose $i'$ with $i' \ge i$ and $i' \ge j_0$. Define $f'_i = f_{ii'} \circ f'_{i'}$. Let us see
that $f'_i$ is well defined. Let $i''$ have $i'' \ge i$ and $i'' \ge j_0$. Choose $i'''$ with $i''' \ge i'$
and $i''' \ge i''$. The computation

$$f_{ii'} \circ f'_{i'} = f_{ii'} \circ f_{i'i'''} \circ f'_{i'''} = f_{ii'''} \circ f'_{i'''}$$

shows that $i'$ and $i'''$ yield the same definition of $f'_i$, and a similar argument
shows that $i''$ and $i'''$ yield the same definition. Therefore $i'$ and $i''$ yield the same
definition. Thus $f'_i$ is now defined for all $i$ in $I$.

We shall show that $(G', \{f'_i\}_{i \in I})$ is an inverse limit of $(I, \{G_i\}, \{f_{ij}\})$, and then
the corollary follows from Proposition 7.28. Property (i) of inverse limits is built
into the definition of the homomorphisms $f'_i$. For property (ii) of $G'$, suppose that
$(\widetilde{G}, \{\widetilde{f}_i\}_{i \in I})$ is a pair consisting of a compact topological group $\widetilde{G}$ and continuous
homomorphisms $\widetilde{f}_i : \widetilde{G} \to G_i$ such that $i \le j$ implies $f_{ij} \circ \widetilde{f}_j = \widetilde{f}_i$. By (ii) for
existence with $G$, find a continuous homomorphism $F : \widetilde{G} \to G$ with $f_i \circ F = \widetilde{f}_i$
for all $i$. Substituting from $f'_i \circ P = f_i$, we obtain $f'_i \circ (P \circ F) = \widetilde{f}_i$, and this
says that $P \circ F : \widetilde{G} \to G'$ is the map we seek for the existence in (ii) for $G'$. For
uniqueness in (ii), suppose that $F' : \widetilde{G} \to G'$ satisfies $f'_i \circ F' = \widetilde{f}_i$ for all $i$. Then
$f'_i \circ F' = f'_i \circ (P \circ F)$ for $i \ge j_0$. By (ii) for uniqueness with $G'$, $F' = P \circ F$.
This says that the map from $\widetilde{G}$ to $G'$ in (ii) is unique.


Let us now apply these considerations to topologize Galois groups of infinite
separable normal algebraic extensions. The topologized Galois group will be the
inverse limit of finite Galois groups, each with the discrete topology.¹⁰

We return to our field $K$, its algebraic closure $K_{\mathrm{alg}}$, and its separable algebraic
closure $K_{\mathrm{sep}}$ within $K_{\mathrm{alg}}$. Let $L$ be a field with $K \subseteq L \subseteq K_{\mathrm{sep}}$, and assume that
$L/K$ is a normal extension, not necessarily finite. We shall topologize $\mathrm{Gal}(L/K)$.
Let $x$ be any element of $L$, and let $F$ be the finite extension $F = K(x)$ of $K$. If
$f$ is the minimal polynomial of $x$ over $K$, then $f$ has a root in $L$ and must split in
$L$ because $L/K$ is normal. Let $x_1, \dots, x_n$ be the roots of $f$, with $x_1 = x$. Then
$E = K(x_1, \dots, x_n)$ is a finite normal extension of $K$ with $K \subseteq F \subseteq E \subseteq L$.
Since $x$ is arbitrary in $L$, $L$ is the union of all the finite normal extensions of $K$
lying within $L$.

For each pair $(E, E')$ of normal extensions of $K$ with $K \subseteq E \subseteq E' \subseteq L$,
Proposition 7.24 gives us restriction homomorphisms $\rho_{EE'} : \mathrm{Gal}(E'/K) \to
\mathrm{Gal}(E/K)$. We write $\rho_E$ for the special case that $E' = L$, so that $\rho_E = \rho_{EL}$.
If $K \subseteq E \subseteq E' \subseteq E'' \subseteq L$, then $\rho_{EE'} \circ \rho_{E'E''} = \rho_{EE''}$, and consequently the
system

$$\Big(\{E \mid E \text{ a finite normal extension of } K \text{ in } L\},\ \{\mathrm{Gal}(E/K)\},\ \{\rho_{EE'}\}\Big)$$

is an inverse system of (discrete finite) topological groups. Meanwhile, we can
form the group $\mathrm{Gal}(L/K)$ and the system $\{\rho_E\}$ of homomorphisms with $\rho_E =
\rho_{EL}$.


Proposition 7.30. With the above notation, the group Gal(L/K) may be 
identified with the underlying abstract group of the inverse limit jim. Gal(E/K), 
<_ 


taken over finite normal extensions E/K with E C L, in such a way that the 
homomorphisms g¢ become the homomorphisms of the inverse limit. 


'0The inverse limit of a finite group is called a profinite group. Profinite groups have special 
properties by comparison with general compact groups, but it will not be necessary for us to undertake 
a study of them. 
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PROOF. Let G = jim. Gal(E/K), put Gg = Gal(E/K), and regard G as the 


standard inverse limit given as in Proposition 7.27: 
G={(ve)e €T]~Ge | vee’ (Ye’) = Ye Whenever E C E’}. 


For each E,, we have a homomorphism gg : Gal(L/K) — Gz, and the product 
of the values of these defines a homomorphism ® : Gal(L/K) > [| Ge. The 
relations gzE O Ye’E” = GEE” Show that the image of ® is contained in the 
subgroup G of [| Gz. We shall show that ® : Gal(L/K) — G is one-one 
onto. 

Let us see that $\Phi$ is one-one. If $\gamma \ne 1$ is in Gal(L/K), then there exists $x \in L$ with $\gamma(x) \ne x$. Let E be a finite normal extension of K within L containing x. Then $\gamma|_E \ne 1$, and thus $\rho_E(\gamma) \ne 1$. Hence $\Phi(\gamma) \ne 1$, and $\Phi$ is one-one.

Let us see that $\Phi$ is onto G. Let $(\gamma_E)_E \in G$ be given. For x in L, choose a finite normal E with $x \in E$ and $E \subseteq L$, and define $\gamma(x) = \gamma_E(x)$. The relations among the $\rho_{EE'}$ show that this definition of $\gamma(x)$ is independent of the choice of E, and $\gamma$ is therefore a field map of L into itself. Certainly $\gamma$ fixes K, and we can construct an inverse to $\gamma$ from the mappings $\gamma_E^{-1}$. Thus $\gamma$ is in Gal(L/K). Application of $\Phi$ gives $\Phi(\gamma) = (\rho_E(\gamma))_E = (\gamma_E)_E$, and $\Phi$ is onto.
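The compatibility condition that cuts G out of the product can be tried out at finite level. The following is only a minimal sketch under an assumption not made in the text: we take K to be a finite field and use the standard fact that the Galois group of its degree-n extension is cyclic of order n, generated by the Frobenius map, so that the restriction map from level b to level a (for a dividing b) becomes reduction from Z/bZ to Z/aZ. The names `restrict` and `is_compatible` are hypothetical.

```python
def restrict(b, a, value):
    """Model of the restriction map from level b to level a: Z/bZ -> Z/aZ for a | b."""
    if b % a != 0:
        raise ValueError("restriction is defined only when a divides b")
    return value % a


def is_compatible(family):
    """family maps n to an element of Z/nZ; check the inverse-limit condition
    restrict(gamma_b) == gamma_a whenever a divides b."""
    return all(
        restrict(b, a, family[b]) == family[a]
        for a in family
        for b in family
        if b % a == 0
    )


levels = [1, 2, 3, 4, 6, 12]
frobenius = {n: 1 % n for n in levels}   # the Frobenius map at every level
cube = {n: 3 % n for n in levels}        # its third power at every level
broken = dict(frobenius)
broken[12] = 5                           # altered at one level only

print(is_compatible(frobenius), is_compatible(cube), is_compatible(broken))
```

The first two families are compatible and so define elements of the (truncated) inverse limit; the third is a member of the product that fails the condition at the pair of levels 6 and 12.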


Using Proposition 7.30, we transfer the topology from $\varprojlim_E \mathrm{Gal}(E/K)$ to Gal(L/K), and we can now regard Gal(L/K) as a compact topological group. For any finite normal extension F of K with $F \subseteq L$, consider the group Gal(L/F). The inverse-limit topology identifies Gal(L/K) with a subgroup of $\prod_{E \supseteq K} \mathrm{Gal}(E/K)$, the product being taken over all finite normal extensions E of K contained in L, and Corollary 7.29 allows us to identify Gal(L/K) with a subgroup of

$$\prod_{E \supseteq F} \mathrm{Gal}(E/K),$$

the product being taken over all finite normal extensions E of F contained in L. Under this identification Gal(L/F) is identified with the subgroup of elements $\gamma$ of the image of Gal(L/K) for which $\rho_F(\gamma) = 1$. Since $\rho_F$ is continuous, this is a closed set. In turn, this set equals the image of Gal(L/F) in the subset

$$\prod_{E \supseteq F} \mathrm{Gal}(E/F).$$

The latter gives the standard inverse limit topology on Gal(L/F). Except for some details, the conclusion is as follows.


Corollary 7.31. With the notation of Proposition 7.30, give Gal(L/K) the inverse-limit topology. If F is a finite normal extension of K contained in L, then Gal(L/F) is a closed subgroup of Gal(L/K), and the relative topology on Gal(L/F) coincides with the inverse-limit topology of Gal(L/F). The subgroup Gal(L/F) of Gal(L/K) is a normal subgroup of finite index in Gal(L/K). Being a closed subgroup of finite index, it is an open subgroup.


PROOF. We still need to prove that Gal(L/F) has finite index in Gal(L/K). Proposition 7.24 shows that the restriction to F of any member of Gal(L/K) is an automorphism of F. Since F is a finite extension of K, there are only finitely many possibilities for this automorphism. If two elements $\gamma$ and $\gamma'$ of Gal(L/K) restrict to the same automorphism of F, then $\gamma^{-1}\gamma'$ is a member of Gal(L/K) fixing F, i.e., a member of Gal(L/F). Thus $\gamma'$ lies in the coset $\gamma\,\mathrm{Gal}(L/F)$, and we conclude that there are only finitely many cosets. Since every member of Gal(L/K) restricts on F to an automorphism of F, the subgroup of members of Gal(L/K) restricting to the identity on F is a normal subgroup. Thus Gal(L/F) is normal in Gal(L/K).


Corollary 7.32. With the notation of Proposition 7.30, Gal(L/K) has a system of open normal subgroups with intersection {1}. Hence the same thing is true of any closed subgroup T of Gal(L/K). Moreover, if U is any open neighborhood of 1 in T, then some open normal subgroup lies in U; consequently the open normal subgroups of T form a neighborhood base about the identity.


PROOF. The open normal subgroups in the first conclusion are the subgroups 
Gal(L/F) as in Corollary 7.31. Since every member of L lies in some finite 
normal extension of K within L, a member of Gal(L/K) cannot lie in every 
Gal(L/F) unless it is the identity on L. 

Let U be an open neighborhood of 1 in the closed subgroup T of Gal(L/K). The set-theoretic complement $U^c$ of U in T is a compact set, and the complements of the open normal subgroups of T are open sets whose union covers $U^c$, by the result of the previous paragraph. By compactness finitely many complements of open normal subgroups of T together cover $U^c$. The intersection of these open normal subgroups is then an open normal subgroup contained in U.


Theorem 7.33 (Fundamental Theorem of Galois Theory). Let K be a field, and let $K_{\mathrm{alg}}$ be an algebraic closure, so that $K \subseteq K_{\mathrm{sep}} \subseteq K_{\mathrm{alg}}$. Let L be a normal extension of K lying in $K_{\mathrm{sep}}$. Let $\mathcal{S}$ be the set of all closed subgroups of Gal(L/K), and let $\mathcal{F}$ be the set of all intermediate fields between K and L. Then $F \mapsto \mathrm{Gal}(L/F)$ is a one-one mapping of $\mathcal{F}$ onto $\mathcal{S}$ with inverse $S \mapsto L^S$, $L^S$ being the fixed field within L of the group S.


PROOF. First we show that Gal(L/F) is closed; Corollary 7.31 shows this only when F is a finite normal extension of K. Let $\{F_\alpha\}$ be the set of all finite extensions of K contained in F. Then $F = \bigcup_\alpha F_\alpha$, and thus $\mathrm{Gal}(L/F) = \bigcap_\alpha \mathrm{Gal}(L/F_\alpha)$. Each $F_\alpha$ is contained in a finite normal extension $E_\alpha$ of K lying in L, and hence $\mathrm{Gal}(L/F_\alpha) \supseteq \mathrm{Gal}(L/E_\alpha)$. Corollary 7.31 shows that $\mathrm{Gal}(L/E_\alpha)$ is an open subgroup of Gal(L/K), and hence the larger subgroup $\mathrm{Gal}(L/F_\alpha)$ is open (as a union of cosets, each of which is open). Open subgroups are closed. Thus $\mathrm{Gal}(L/F_\alpha)$ is closed, and so is $\mathrm{Gal}(L/F) = \bigcap_\alpha \mathrm{Gal}(L/F_\alpha)$.

Next if F is in $\mathcal{F}$, then the inclusion $L \supseteq F$ and the fact that L is normal over K together imply that L is normal over F. By Corollary 7.26, $F = L^{\mathrm{Gal}(L/F)}$. Hence $F \mapsto \mathrm{Gal}(L/F)$ is one-one, and $S \mapsto L^S$ is a left inverse of it.

Finally we show that $S \mapsto L^S$ is a right inverse by showing that $\mathrm{Gal}(L/L^S) = S$ for any closed subgroup S of Gal(L/K). Define $T = \mathrm{Gal}(L/L^S)$. Certainly $S \subseteq T$. The previous step shows that $F = L^{\mathrm{Gal}(L/F)}$ for all $F \in \mathcal{F}$. Taking $F = L^S$ gives $L^S = L^{\mathrm{Gal}(L/L^S)} = L^T$. Let V be an arbitrary open normal subgroup of T, and put $E = L^V$. The members of T/V give well-defined automorphisms of E, and

$$E^{T/V} = (L^V)^T = L^T = L^S = (L^V)^S = E^{SV/V}. \qquad (*)$$

The group T/V is a finite group of automorphisms of E fixing K, and Corollary 9.37 of Basic Algebra, when applied to the group T/V and the separable extension $E/E^{T/V}$, shows that $T/V = \mathrm{Gal}(E/E^{T/V})$. Similarly it shows that $SV/V = \mathrm{Gal}(E/E^{SV/V})$. By (*), $T/V = SV/V$, i.e., $T = SV$. Corollary 7.32 shows that the open normal subgroups of T form a neighborhood base about the identity of T. From the equality $T = SV$ for arbitrary V, let us see that

$$S \text{ is dense in } T. \qquad (**)$$

Arguing by contradiction, let g be in T but not in the closure of S. Find V small enough so that $gV^{-1} \cap S = \varnothing$. From $T = SV$, we can write $g = sv$ with $s \in S$ and $v \in V$. Then $svV^{-1} \cap S = \varnothing$, and hence $vV^{-1} \cap S = \varnothing$. This last equality is a contradiction, since the identity lies in $vV^{-1}$, and (**) is proved. Since S is closed, it follows from (**) that $S = T$. But $T = \mathrm{Gal}(L/L^S)$ by definition. Therefore $\mathrm{Gal}(L/L^S) = S$, and the proof of the theorem is complete.


Theorem 7.34. Let K be a perfect field, and let L be an algebraically closed field containing K. Then the only members of L fixed by every element of Gal(L/K) are the members of K.


PROOF. Proposition 7.15 shows that $K_{\mathrm{sep}} = K_{\mathrm{alg}}$, and Corollary 7.26 implies that the only members of $K_{\mathrm{alg}}$ fixed by $\mathrm{Gal}(K_{\mathrm{alg}}/K)$ are the members of K. Thus we are done unless L contains elements not in $K_{\mathrm{alg}}$.

Let x and y be any two members of L not in $K_{\mathrm{alg}}$, and let $\psi$ be in $\mathrm{Gal}(K_{\mathrm{alg}}/K)$. The singleton sets {x} and {y} are transcendence sets over $K_{\mathrm{alg}}$, and Lemma 7.6 shows that they can be extended to transcendence bases of L over $K_{\mathrm{alg}}$. Call these transcendence bases E and F, respectively. Theorem 7.9 shows that E and F have the same cardinality. Therefore there exists a one-one function $\varphi$ of E onto F such that $\varphi(x) = y$. This function $\varphi$ extends uniquely to a field map $\Phi$ of $K_{\mathrm{alg}}(E)$ onto $K_{\mathrm{alg}}(F)$ that restricts to $\psi$ on $K_{\mathrm{alg}}$. Theorem 7.7 shows that L is an algebraic extension of $K_{\mathrm{alg}}(E)$ and of $K_{\mathrm{alg}}(F)$; hence L is an algebraic closure of $K_{\mathrm{alg}}(E)$ and of $K_{\mathrm{alg}}(F)$. The composition of $\Phi$ followed by inclusion is a field map of $K_{\mathrm{alg}}(E)$ into L, and Theorem 9.23 of Basic Algebra shows that it can be extended to a field map $\widetilde{\Phi}$ of L into L. Since $\widetilde{\Phi}(L)$ is an algebraic closure of $K_{\mathrm{alg}}(F)$, $\widetilde{\Phi}(L) = L$. Thus there exists a member $\widetilde{\Phi}$ of Gal(L/K) with $\widetilde{\Phi}(x) = y$ such that $\widetilde{\Phi}|_{K_{\mathrm{alg}}} = \psi$.

Taking $\psi$ to be the identity shows that no element of L transcendental over K is fixed by Gal(L/K). If an element z of $K_{\mathrm{alg}}$ is given that is not in K, then the first paragraph of the proof produces a member $\psi$ of $\mathrm{Gal}(K_{\mathrm{alg}}/K)$ that moves z. Applying the result of the second paragraph to this $\psi$ with x arbitrary and with y = x shows that $\psi$ extends to a member of Gal(L/K) that moves z.


7. Problems 


1. Let L/K be a field extension in characteristic p. Prove that the set of elements 
of L that are purely inseparable over K is a subfield of L. 


2. In characteristic p, let $K(\alpha)$ be an algebraic extension of a field K, and form the inclusions $K \subseteq K(\alpha^{p^\mu}) \subseteq K(\alpha)$, where $\alpha^{p^\mu}$ is the smallest power of $\alpha$ that is separable over K. Prove that the subfield of separable elements in the extension $K(\alpha)/K$ consists exactly of $K(\alpha^{p^\mu})$, i.e., that no separable elements of $K(\alpha)$ over K lie outside $K(\alpha^{p^\mu})$.


3. Partially order the positive integers by saying that $a \le b$ if a divides b. Let $(Z, \{f_a\}_{a \ge 1})$ be the inverse limit of the cyclic groups $\mathbb{Z}/a\mathbb{Z}$, with the homomorphism $f_{ab}$ from $\mathbb{Z}/b\mathbb{Z}$ to $\mathbb{Z}/a\mathbb{Z}$ being given by $f_{ab}(1 + b\mathbb{Z}) = 1 + a\mathbb{Z}$ when a divides b. Each member c of $\mathbb{Z}$ defines a member $z_c$ of Z such that $f_a(z_c) = c + a\mathbb{Z}$ for all a. Exhibit some other explicit member of Z.


4. Prove that the only members of C fixed by all members of Gal(C/Q) are the 
members of Q. What members of R are fixed by Gal(R/Q)? 


5. By making use of the field $K = \mathbb{Q}(\sqrt{2}, \sqrt{3}, \sqrt{5}, \sqrt{7}, \ldots)$, show that there exist subgroups of $\mathrm{Gal}(\mathbb{Q}_{\mathrm{alg}}/\mathbb{Q})$ of index 2 that are not open.


Problems 6-14 concern primary ideals and make use of the notion of the radical $\sqrt{I}$ of an ideal I as defined in Section 1. Throughout, R will denote a commutative ring with identity. A proper ideal I of R is primary if whenever a and b are in R, ab is in I, and a is not in I, then $b^m$ is in I for some integer m > 0. It is immediate that every prime ideal is primary.

6. Prove that an ideal I of R is primary if and only if every zero divisor in R/I is nilpotent (in the sense that some power of it is 0), if and only if 0 is primary in R/I.

7. (a) Prove that if I is a primary ideal, then $\sqrt{I}$ is a prime ideal. (Educational note: In this case the prime ideal $\sqrt{I}$ is called the associated prime ideal to I.)
(b) Prove that if I is any ideal and if $I \subseteq J$ for a prime ideal J, then $\sqrt{I} \subseteq J$.

8. (a) Show that the primary ideals in $\mathbb{Z}$ are 0 and $(p^n)$ for p prime and n > 0.
(b) Let $R = \mathbb{C}[x, y]$ and $I = (x, y^2)$. Use Problem 6 to show that I is primary. Show that $P = \sqrt{I}$ is given by $P = (x, y)$. Deduce that $P^2 \subsetneq I \subsetneq P$ and that a primary ideal is not necessarily a power of a prime ideal.
(c) Let K be a field, let $R = K[X, Y, Z]/(XY - Z^2)$, and let x, y, z be the images of X, Y, Z in R. Show that $P = (x, z)$ is prime by showing that R/P is an integral domain. Show that $P^2$ is not primary by starting from the fact that $xy = z^2$ lies in $P^2$.

9. Prove that if I is an ideal such that $\sqrt{I}$ is maximal, then I is primary. Deduce that the powers of a maximal ideal are primary.

10. An ideal is reducible if it is the finite intersection of ideals strictly containing it; otherwise it is irreducible.
(a) Show that every prime ideal is irreducible.
(b) Let $R = \mathbb{C}[x, y]$, and let I be the maximal ideal (x, y). Show that $I^2$ is primary and that the equality $I^2 = (Rx + I^2) \cap (Ry + I^2)$ exhibits $I^2$ as reducible.

11. Prove that if R is Noetherian, then every ideal is a finite intersection of proper irreducible ideals. (The ideal R is understood to be an empty intersection.)

12. Suppose that R is Noetherian and that Q is a proper irreducible ideal in R. Prove that 0 is primary in R/Q, and deduce that Q is primary in R.

13. Prove that if $Q_1, \ldots, Q_n$ are primary ideals in R that all have $\sqrt{Q_i} = P$, then $Q = \bigcap_{i=1}^n Q_i$ is primary with $\sqrt{Q} = P$.

14. (Lasker-Noether Decomposition Theorem) The expression $I = \bigcap_{i=1}^n Q_i$ of an ideal I as an intersection of primary ideals $Q_i$ is said to be irredundant if
(i) no $Q_i$ contains the intersection of the other ones, and
(ii) the $Q_i$ have distinct associated prime ideals.
Prove that if R is Noetherian, then every ideal is the irredundant intersection of finitely many primary ideals.


CHAPTER VIII 


Background for Algebraic Geometry 


Abstract. This chapter introduces aspects of the algebraic theory of systems of polynomial equations 
in several variables. 

Section 1 gives a brief history of the subject, treating it as one of two early sources of questions 
to be addressed in algebraic geometry. 

Section 2 introduces the resultant as a tool for eliminating one of the variables in a system of 
two such equations. A first form of Bezout’s Theorem is an application, saying that if f(X, Y) and 
g(X, Y) are polynomials of respective degrees m and n whose locus of common zeros has more 
than mn points, then f and g have a nontrivial common factor. This version of the theorem may be 
regarded as pertaining to a pair of affine plane curves. 

Section 3 passes to projective plane curves, which are nonconstant homogeneous polynomials in 
three variables, two such being regarded as the same if they are multiples of one another. Versions of 
the resultant and Bezout’s Theorem are valid in this context, and two projective plane curves defined 
over an algebraically closed field always have a common zero. 

Sections 4-5 introduce intersection multiplicity for projective plane curves. Section 4 treats a
line and a curve, and Section 5 treats the general case of two curves. The theory in Section 4 is 
completely elementary, and a version of Bezout’s Theorem is proved that says that a line and a curve 
of degree d have exactly d common zeros, provided the underlying field is algebraically closed, 
the zeros are counted as often as their intersection multiplicities, and the line does not divide the 
curve. Section 5 makes more serious use of algebraic background, particularly localizations and the 
Nullstellensatz. It gives an indication that ostensibly simple phenomena in the subject can require 
sophisticated tools to analyze. 

Section 6 proves a version of Bezout’s Theorem appropriate for the context of Section 5: if F 
and G are two projective plane curves of respective degrees m and n over an algebraically closed 
field, then either they have a nontrivial common factor or they have exactly mn common zeros when 
the intersection multiplicities of the zeros are taken into account. 

Sections 7-10 concern Gröbner bases, which are finite generating sets of a special kind for ideals in a polynomial algebra over a field. Section 7 sets the stage, introducing monomial orders and defining Gröbner bases. Section 8 establishes a several-variable analog of the division algorithm for polynomials in one variable and derives from it a usable criterion for a finite set of generators to be a Gröbner basis. From this it is easy to give a constructive proof of the existence of Gröbner bases and to obtain as consequences solutions of the ideal-membership problem and the proper-ideal problem. Section 9 obtains a uniqueness theorem under the condition that the Gröbner basis be reduced. Adjusting a Gröbner basis to make it reduced is an easy matter. A consequence of the uniqueness result is a solution of the ideal-equality problem. Section 10 gives two theorems concerning solutions of systems of polynomial equations. The Elimination Theorem identifies in terms of Gröbner bases those members of the ideal that depend only on a certain subset of the variables. The Extension Theorem, proved under the additional assumption that the underlying field is algebraically closed, gives conditions under which a solution to the subsystem of equations that depend on all but one variable can be extended to a solution of the whole system. The latter theorem makes use of the theory of resultants.


1. Historical Origins and Overview 


Modern algebraic geometry grew out of early attempts to solve simultaneous 
polynomial equations in several variables and out of the theory of Riemann 
surfaces. We shall discuss the first of these sources in the present chapter and the 
second of the sources in Chapter IX. 

Serious consideration of simultaneous polynomial equations of degree > 2
dates to a 1750 book¹ by Gabriel Cramer (1704-1752), who may be better
known for Cramer’s rule in connection with determinants. Cramer was interested 
in various aspects of the zero loci of polynomials in two variables with real 
coefficients. Thinking of the zero locus, we refer to a nonconstant polynomial in 
two variables as a plane curve. 

One of the problems of interest to Cramer was to find the number of points in the plane that would uniquely determine a plane curve of degree n up to a constant multiple. Cramer gave the answer $\tfrac{1}{2}n(n + 3)$ to this problem. For example, when n = 2, if we normalize matters by taking the coefficient of $x^2$ to be 1, then the possible quadratic polynomials

$$f(x, y) = x^2 + bxy + cy^2 + dx + ey + f$$


involve five unknown coefficients. Each condition f(x;, y;) = 0 gives a linear 
condition on the coefficients, and Cramer was able to write down explicitly a 
plane curve through the given points in question by introducing determinants and 
applying his rule to solve the problem. 
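The linear-algebra content of this example can be made concrete. The following is a minimal sketch, not Cramer's determinant formula itself: it normalizes the coefficient of $x^2$ to 1 as in the text and solves the resulting five linear conditions by Gauss-Jordan elimination over the rationals. The function names `solve` and `conic_through` are hypothetical.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Gauss-Jordan elimination over the rationals; raises if the system is singular."""
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("points in special position: no unique solution")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def conic_through(points):
    """Return (b, c, d, e, f) with x^2 + b*x*y + c*y^2 + d*x + e*y + f = 0 at each point."""
    rows = [[x * y, y * y, x, y, 1] for x, y in points]
    rhs = [-x * x for x, _ in points]
    return solve(rows, rhs)


# Five points on the circle x^2 + y^2 = 25
print(conic_through([(5, 0), (-5, 0), (0, 5), (3, 4), (4, -3)]))
```

For the five points shown, the solution is b = 0, c = 1, d = 0, e = 0, f = -25, which recovers the circle. Five points of which four are collinear, such as (0, 0), (1, 0), (2, 0), (3, 0), (0, 1), make the system singular, and the sketch raises an error instead of returning a curve.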

Already with this much description the reader will see a certain subtlety, namely that
there will be special choices of the five points for which existence or uniqueness
will fail. We could also ask about the effect of multiplicities: what does it mean 
geometrically to take two or more of the points to be equal, and how does such 
an occurrence affect the number of points that can be specified? 

Cramer noticed a subtlety that is less easy to resolve, even in hindsight. If we are given any two plane curves of degree 3, then Cardan’s formula says that we can solve one equation for y in terms of x, obtaining three expressions in x; then we can substitute for y in the other equation each of the three expressions in x and obtain a cubic equation in x each time. In other words, we should expect up to 9 points of intersection for two cubics, and 9 should sometimes occur. (The various forms of Bezout’s Theorem, which came a little later, confirm this argument.) The number of points that determine a cubic completely is $\tfrac{1}{2}n(n + 3)$ for n = 3, i.e., is 9. Thus we have 9 points determining a unique cubic, and yet the second cubic goes through these 9 points as well. What is happening? This question has come to be known as Cramer’s paradox.


¹G. Cramer, Introduction à l’Analyse des Lignes courbes algébriques, Chez les Frères Cramer & Cl. Philibert, Geneva, 1750.

Explaining this kind of mystery became an early impetus for the development 
of algebraic geometry. 
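A small computation makes the paradox tangible. The sketch below is an illustration and not part of Cramer's argument: each point imposes one linear condition on the 10 coefficients of a cubic, and we compute the rank of the resulting 9-by-10 system over the rationals. The two configurations are our own choices: the 3-by-3 grid, which is the intersection of the cubics $x^3 - x$ and $y^3 - y$, and nine points on the curve $y = x^3$. The names `rank` and `cubic_conditions` are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a rectangular matrix by Gaussian elimination over the rationals."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            if m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r


def cubic_conditions(points):
    """One row per point: the values of the 10 monomials x^i y^j with i + j <= 3."""
    exponents = [(i, j) for i in range(4) for j in range(4 - i)]
    return [[x**i * y**j for i, j in exponents] for x, y in points]


grid = [(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)]
on_one_cubic = [(t, t**3) for t in range(9)]

print(rank(cubic_conditions(grid)), rank(cubic_conditions(on_one_cubic)))
```

For the grid the nine conditions have rank 8 only, so they leave a two-parameter family of cubics instead of a single cubic up to a constant multiple. For the second configuration the rank is 9 and the cubic is determined up to a constant multiple. In this sense the count $\tfrac{1}{2}n(n + 3)$ describes points whose conditions are independent, and the 9 intersection points of two cubics are not of that kind.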

The question of the number of points of intersection had been the subject of conjecture for some time earlier, and it was expected that two plane curves of respective total degrees m and n in some sense had mn points of intersection. Étienne Bezout (1730-1783) took up this question and dealt with parts of it rigorously. The quadratic case can be solved by finding one variable in terms of the other and by substituting, but let us handle it by the method that Bezout used. If we view each polynomial as quadratic in y and having coefficients that depend on x, then we have a system

$$\begin{aligned} a_0 + a_1 y + a_2 y^2 &= 0,\\ b_0 + b_1 y + b_2 y^2 &= 0. \end{aligned}$$

Instead of regarding this as a system of two equations for y, we regard it as a system of two homogeneous linear equations for variables $x_0, x_1, x_2$, where $x_0 = 1$, $x_1 = y$, $x_2 = y^2$. We can get two further equations by multiplying each equation by y:

$$\begin{aligned} a_0 y + a_1 y^2 + a_2 y^3 &= 0,\\ b_0 y + b_1 y^2 + b_2 y^3 &= 0, \end{aligned}$$

and then we have four homogeneous linear equations for $x_0 = 1$, $x_1 = y$, $x_2 = y^2$, $x_3 = y^3$. Since the system has the nonzero solution $(1, y, y^2, y^3)$, the determinant of the coefficient matrix must be 0. Remembering that the coefficients depend on x, we see that we have eliminated the variable y and obtained a polynomial equation for x without using any solution formula for polynomials in one variable. The device that Bezout introduced for this purpose, the determinant of the coefficient matrix, is called the resultant of the system and is a fundamental tool in handling simultaneous polynomial equations. With it Bezout went on in 1779 to give a rigorous proof that when two polynomials in (x, y) are set equal to 0 simultaneously, one of degree m and the other of degree n, then there cannot be more than mn solutions unless the two polynomials have a common factor. This is a first form of Bezout’s Theorem and is proved in Section 2.
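The elimination step for two quadratics can be checked numerically. The following is a minimal sketch under our own choice of example, the circle $x^2 + y^2 - 5 = 0$ and the parabola $y^2 - 4x = 0$; it evaluates the 4-by-4 determinant at individual values of x instead of forming it symbolically. The rows are listed in the order f, yf, g, yg, which affects the determinant at most by a sign. The names `det`, `eliminate_y`, and `resultant_at` are hypothetical.

```python
def det(matrix):
    """Determinant by Laplace expansion along the first row (fine for a 4-by-4 matrix)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for j, entry in enumerate(matrix[0]):
        if entry != 0:
            minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
            total += (-1) ** j * entry * det(minor)
    return total


def eliminate_y(a, b):
    """a = (a0, a1, a2) and b = (b0, b1, b2) are the coefficients of 1, y, y^2,
    already evaluated at some fixed x; return the determinant of the 4-by-4 system."""
    a0, a1, a2 = a
    b0, b1, b2 = b
    return det([
        [a0, a1, a2, 0],   # f = 0
        [0, a0, a1, a2],   # y * f = 0
        [b0, b1, b2, 0],   # g = 0
        [0, b0, b1, b2],   # y * g = 0
    ])


def resultant_at(x):
    # f = (x^2 - 5) + 0*y + 1*y^2 and g = (-4x) + 0*y + 1*y^2
    return eliminate_y((x * x - 5, 0, 1), (-4 * x, 0, 1))


print([x for x in range(-6, 7) if resultant_at(x) == 0])
```

In this example the determinant works out to $(x^2 + 4x - 5)^2$, so it vanishes exactly at x = 1 and x = -5. The value x = 1 gives the two real common points (1, 2) and (1, -2), while x = -5 corresponds to common points with nonreal y.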

In order to have a chance of obtaining a full complement of mn solutions, we make three adjustments: allow complex solutions instead of just real solutions (even in the case (m, n) = (2, 1)), consider “projective plane curves” instead of ordinary plane curves to allow for solutions at infinity (even in the case (m, n) = (1, 1)), and introduce a suitable notion of intersection number of two plane curves at a point in order to take multiplicities into account (even in the case (m, n) = (2, 1)). We shall allow complex solutions already in Section 2, and we shall make an adjustment for projective plane curves in Section 3. The issue of intersection multiplicity is more complicated. The beginnings of a classical approach to it are indicated in Section 4, and a somewhat more modern approach appears in Section 5. With the full theory of intersection multiplicities of projective plane curves in place, we obtain a general form of Bezout’s Theorem² in Section 6.
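The need for the first two adjustments can be seen in the smallest cases. The sketch below is an illustration under our own conventions, which anticipate Section 3 only informally: a line $ax + by + cz = 0$ in homogeneous coordinates is stored as the triple (a, b, c), and the common point of two distinct lines is computed as a cross product. The function names are hypothetical.

```python
import cmath


def line_meet(l1, l2):
    """Lines a*x + b*y + c*z = 0 given as (a, b, c); return a common point [x : y : z]."""
    (a1, b1, c1), (a2, b2, c2) = l1, l2
    return (b1 * c2 - c1 * b2, c1 * a2 - a1 * c2, a1 * b2 - b1 * a2)


def circle_meets_horizontal_line(h):
    """Solutions of x^2 + y^2 = 1 and y = h, allowing complex x."""
    root = cmath.sqrt(1 - h * h)
    return [(root, h), (-root, h)]


# (m, n) = (1, 1): the parallel affine lines x + y = 1 and x + y = 2
print(line_meet((1, 1, -1), (1, 1, -2)))

# (m, n) = (2, 1): the unit circle and the line y = 2
print(circle_meets_horizontal_line(2))
```

The two parallel lines meet at the point with homogeneous coordinates (-1, 1, 0), which has z = 0 and so lies at infinity. The circle and the line y = 2 have no real common point but two complex ones. For the tangent line y = 1 the two solutions coincide, which is the situation that intersection multiplicities are meant to count correctly.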

The theory of the resultant can be extended in various ways, but we shall largely not pursue this matter. Studies of zero loci of systems of equations took a more geometric turn in the first part of the nineteenth century through the work of Julius Plücker (1801-1858) and others, but these matters will be left for an implicit discussion in Chapter X. Instead, we skip to a development that began with the doctoral thesis of Bruno Buchberger in 1965. Buchberger was interested in being able to decide when a polynomial is a member of an ideal that is specified by a finite list of generators. For this purpose he learned that each ideal has a special finite set of generators that is unique once certain declarations are made. He devised an algorithm for determining such a set of generators,³ and he gave the name “Gröbner basis” to the set, in honor of his thesis advisor.⁴ The special unique such basis is called a “reduced Gröbner basis.”

An unfortunate feature of the algorithm (and even of later improved algorithms) is that Gröbner bases are extraordinarily complicated to calculate. The timing of Buchberger’s discovery was therefore especially fortuitous, coming when computers were becoming more common, more economical, and more powerful.

Buchberger was able to give a test for membership in an ideal in terms of a multivariable division algorithm involving any Gröbner basis. Other general problems involving ideals were solvable as well. Because of the uniqueness of the reduced Gröbner basis, two ideals are identical if and only if their reduced Gröbner bases are equal. When some of the theory of resultants was incorporated into the theory of Gröbner bases, these bases could also be used to address various questions of identifying zero loci. Other problems involving ideals could be addressed by similar methods. The theory has flowered tremendously since its initial discovery and by the present day has found many imaginative applications to applied problems. Sections 7-10 give an introductory account of this important theory.


²A correct proof of the general form of the theorem seems to have been published for the first time by Georges-Henri Halphen (1844-1889) in 1873.

³Devising the algorithm was Buchberger’s real contribution, since the abstract existence of the special set of generators is an easy consequence of the Hilbert Basis Theorem and had already been used in papers of H. Hironaka in 1964.

⁴Wolfgang Gröbner (1899-1980). The name is often spelled out as “Groebner,” particularly when it is used in connection with computer algorithms.


2. Resultant and Bezout’s Theorem


Let A be a unique factorization domain. The case that $A = K[X_1, \ldots, X_r]$ for a field K will be the main case of interest for us. If f and g are polynomials in A[X] of the form

$$\begin{aligned} f(X) &= f_0 + f_1 X + \cdots + f_m X^m,\\ g(X) &= g_0 + g_1 X + \cdots + g_n X^n, \end{aligned}$$

with m and n both positive, then we let $\mathcal{R}(f, g)$ be the (m + n)-by-(m + n) matrix

$$\mathcal{R}(f, g) = \begin{pmatrix}
f_0 & f_1 & \cdots & f_{m-1} & f_m & 0 & \cdots & 0 \\
0 & f_0 & \cdots & f_{m-2} & f_{m-1} & f_m & \cdots & 0 \\
\vdots & & \ddots & & & & \ddots & \vdots \\
0 & \cdots & 0 & f_0 & f_1 & \cdots & \cdots & f_m \\
g_0 & g_1 & \cdots & g_{n-1} & g_n & 0 & \cdots & 0 \\
0 & g_0 & \cdots & g_{n-2} & g_{n-1} & g_n & \cdots & 0 \\
\vdots & & \ddots & & & & \ddots & \vdots \\
0 & \cdots & 0 & g_0 & g_1 & \cdots & \cdots & g_n
\end{pmatrix}$$

in which there are n rows above the $g_0$ in the first column and there are m remaining rows. The resultant of f and g is the determinant

$$R(f, g) = \det \mathcal{R}(f, g).$$


Theorem 8.1. If A is a unique factorization domain and if f and g are nonzero members of A[X] of the form $f(X) = \sum_{i=0}^{m} f_i X^i$ and $g(X) = \sum_{j=0}^{n} g_j X^j$ with m > 0 and n > 0 and with at least one of $f_m$ and $g_n$ nonzero, then the following are equivalent:

(a) f and g have a common factor of degree > 0 in X,

(b) af + bg = 0 for some nonzero a and b in A[X] with deg a < n and deg b < m,

(c) R(f, g) = 0.

Regard R(f, g) as a constant polynomial in X. When $R(f, g) \ne 0$, there exist unique a and b in A[X] such that $a(X) f(X) + b(X) g(X) = R(f, g)$ with deg a < n and deg b < m. Both the polynomials a and b are nonzero if both f(X) and g(X) are nonconstant.
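The equivalence of (a) and (c) is easy to experiment with when A is the ring of integers. The following is a minimal sketch, assuming polynomials are stored as coefficient lists with the constant term first, matching the row layout of the matrix defined above; the determinant is computed by exact Gaussian elimination over the rationals. The names `sylvester`, `det`, and `resultant` are hypothetical.

```python
from fractions import Fraction


def sylvester(f, g):
    """f = [f0, ..., fm] and g = [g0, ..., gn] with m, n > 0.
    Return the (m+n)-by-(m+n) matrix: n shifted rows of f above m shifted rows of g."""
    m, n = len(f) - 1, len(g) - 1
    rows = [[0] * i + list(f) + [0] * (n - 1 - i) for i in range(n)]
    rows += [[0] * i + list(g) + [0] * (m - 1 - i) for i in range(m)]
    return rows


def det(matrix):
    """Determinant over the rationals by Gaussian elimination with row swaps."""
    a = [[Fraction(v) for v in row] for row in matrix]
    size, sign, result = len(a), 1, Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            sign = -sign
        result *= a[col][col]
        for r in range(col + 1, size):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return sign * result


def resultant(f, g):
    return det(sylvester(f, g))


# (X - 1)(X - 2) and (X - 1)(X + 3) share the factor X - 1
print(resultant([2, -3, 1], [-3, 2, 1]))
# X^2 - 2 and X - 1 have no common factor
print(resultant([-2, 0, 1], [-1, 1]))
```

The first pair has resultant 0, in line with (a) implying (c), and the second pair has resultant -1. With other conventions for ordering the rows or the coefficients, the value can differ by a sign; whether it vanishes does not change.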


REMARKS. The theorem says that af + bg = R(f, g) holds in every case for which at least one of the coefficients $f_m$ and $g_n$ is nonzero. Sometimes the theorem appears in texts with the assumption that both coefficients are nonzero; in this connection, see Problem 5 at the end of the chapter. When R(f, g) = 0, the theorem does not point to a useful way to identify a common factor; the division algorithm can be used for this purpose in some circumstances, but the use of Gröbner bases as in Section 7 will be more helpful.

PROOF. Let us prove the equivalence of (a) and (b). Suppose that (a) holds. If u is a nonconstant polynomial in X that divides both f and g, let us write f = bu and g = −au. Then af + bg = 0. Also, deg a + deg u = deg g; since deg u > 0, $\deg a < \deg g \le n$. Similarly deg b < m. Thus (b) holds. Conversely suppose that (b) holds, so that af = −bg with a and b nonzero and with deg a < n and deg b < m. Suppose that $f_m \ne 0$. The equality af = −bg shows that f divides bg. Since deg b < m = deg f, f cannot divide b. But A[X] is a unique factorization domain, and thus there is some prime factor p of f of positive degree such that $p^k$ for some k divides f but not b. Then p divides both f and g, and (a) holds. A similar argument works if $g_n \ne 0$.

Now we prove the equivalence of (b) and (c). Let F be the field of fractions of A. We set up a one-one correspondence between polynomials a(X) in A[X] of degree at most n − 1 and n-dimensional row vectors $(\alpha_0\ \alpha_1\ \cdots\ \alpha_{n-1})$ with entries in A by the formula

$$a(X) = \alpha_0 + \alpha_1 X + \cdots + \alpha_{n-1} X^{n-1},$$

and similarly we set up one-one correspondences for degrees at most m − 1 and at most m + n − 1 by the formulas

$$\begin{aligned} b(X) &= \beta_0 + \beta_1 X + \cdots + \beta_{m-1} X^{m-1},\\ c(X) &= \gamma_0 + \gamma_1 X + \cdots + \gamma_{m+n-1} X^{m+n-1}. \end{aligned}$$

Examining the form of $\mathcal{R}(f, g)$, we see that the matrix equality

$$(\alpha_0\ \alpha_1\ \cdots\ \alpha_{n-1}\ \beta_0\ \cdots\ \beta_{m-1})\, \mathcal{R}(f, g) = (\gamma_0\ \gamma_1\ \cdots\ \gamma_{m+n-1}) \qquad (*)$$

holds if and only if the polynomial equality

$$a(X) f(X) + b(X) g(X) = c(X) \qquad (**)$$

holds. If (b) holds, then af = −bg, and (**) shows that c = 0. That is, $(\gamma_0\ \gamma_1\ \cdots\ \gamma_{m+n-1})$ is the 0 row vector. Interpreting (*) as a matrix equality over F and assuming that a and b are not both 0, we see that the transpose of $\mathcal{R}(f, g)$ has a nontrivial null space. Therefore $R(f, g) = \det \mathcal{R}(f, g) = 0$. This proves (c). Conversely if (c) holds, then we can find row vectors $(\alpha_0\ \alpha_1\ \cdots\ \alpha_{n-1})$ and $(\beta_0\ \beta_1\ \cdots\ \beta_{m-1})$ not both 0, having entries in F, such that the left side of (*) equals the 0 row vector. Clearing fractions, we may assume that $(\alpha_0\ \alpha_1\ \cdots\ \alpha_{n-1})$ and $(\beta_0\ \beta_1\ \cdots\ \beta_{m-1})$ have entries in A. Referring to (*), we obtain af + bg = 0 with deg a at most n − 1 and deg b at most m − 1. We know that at least one of a and b is nonzero, and we have to see that both are nonzero. The situation is symmetric in a and b. If a were to equal 0, then we would have bg = 0 and we could conclude that b = 0 because $g \ne 0$. So we would obtain the contradiction a = b = 0. This proves (b).

For the last statement of the theorem, suppose that $R(f, g) \ne 0$. Then Cramer’s rule applied over the field of fractions F of A shows that the matrix inverse of $\mathcal{R}(f, g)$ is of the form

$$\mathcal{R}(f, g)^{-1} = R(f, g)^{-1}\, \mathcal{S}(f, g),$$

where $\mathcal{S}(f, g)$ is a matrix with entries in A. Consequently the row vector

$$(R(f, g)\ 0\ \cdots\ 0)\, \mathcal{R}(f, g)^{-1}$$

has entries in A, and we can define members $\alpha_0, \ldots, \alpha_{n-1}, \beta_0, \ldots, \beta_{m-1}$ of A by

$$(\alpha_0\ \alpha_1\ \cdots\ \alpha_{n-1}\ \beta_0\ \cdots\ \beta_{m-1}) = (R(f, g)\ 0\ \cdots\ 0)\, \mathcal{R}(f, g)^{-1}.$$

Then (*) holds with $(\gamma_0\ \gamma_1\ \cdots\ \gamma_{m+n-1}) = (R(f, g)\ 0\ \cdots\ 0)$, and the equality (**) shows that $a(X) f(X) + b(X) g(X) = R(f, g)$. If both f and g are nonconstant, then neither a(X) nor b(X) can be 0, since otherwise the equation would show that R(f, g) is a nonconstant polynomial.
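The construction of a and b in this last step can be carried out directly over the integers. The sketch below rests on one observation that is ours and not spelled out in the text: the row vector $(R(f, g)\ 0\ \cdots\ 0)\,\mathcal{R}(f, g)^{-1}$ is the first row of the classical adjoint of $\mathcal{R}(f, g)$, so its entries are the cofactors of the first column and no division is needed. Polynomials are coefficient lists with the constant term first, and all function names are hypothetical.

```python
def sylvester(f, g):
    """n shifted rows of f = [f0..fm] above m shifted rows of g = [g0..gn]."""
    m, n = len(f) - 1, len(g) - 1
    rows = [[0] * i + list(f) + [0] * (n - 1 - i) for i in range(n)]
    rows += [[0] * i + list(g) + [0] * (m - 1 - i) for i in range(m)]
    return rows


def det(matrix):
    """Integer determinant by Laplace expansion along the first row (small sizes only)."""
    if len(matrix) == 1:
        return matrix[0][0]
    return sum(
        (-1) ** j * entry * det([row[:j] + row[j + 1:] for row in matrix[1:]])
        for j, entry in enumerate(matrix[0])
        if entry != 0
    )


def bezout_identity(f, g):
    """Return (a, b, R) with a*f + b*g = R, deg a < n, deg b < m."""
    big = sylvester(f, g)
    n = len(g) - 1
    # Cofactors of the first column: delete row j and column 0.
    cofactors = [
        (-1) ** j * det([row[1:] for k, row in enumerate(big) if k != j])
        for j in range(len(big))
    ]
    return cofactors[:n], cofactors[n:], det(big)


def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out


def poly_add(p, q):
    size = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(size)]


f, g = [-2, 0, 1], [-1, 1]          # X^2 - 2 and X - 1
a, b, r = bezout_identity(f, g)
print(a, b, r, poly_add(poly_mul(a, f), poly_mul(b, g)))
```

For $f = X^2 - 2$ and $g = X - 1$ this gives a = 1 and b = −1 − X, and indeed $(X^2 - 2) + (-1 - X)(X - 1) = -1 = R(f, g)$.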


Theorem 8.2 (Bezout’s Theorem). Let K be any field, and let f(X, Y) and g(X, Y) be nonconstant polynomials in K[X, Y], of exact respective degrees m and n. If the locus of common zeros of f and g in $K^2$ has more than mn points, then f and g have a nonconstant common factor in K[X, Y].


PROOF. For most of the proof, we assume that K is infinite. Arguing by contradiction, suppose that f and g both vanish at distinct points $(x_i, y_i)$ for $1 \le i \le mn + 1$, and suppose that f and g have no nonconstant common factor. Since there are only finitely many members c of K such that $y_i - y_j = c(x_i - x_j)$ for some i and j with $i \ne j$ and since K is assumed to be infinite, we can find c in K such that $y_i - y_j \ne c(x_i - x_j)$ for all i and j with $i \ne j$. For this c, $y_i - cx_i \ne y_j - cx_j$ when $i \ne j$, and therefore the second coordinates of the points $(x_i, y_i - cx_i)$ are distinct. The common zeros of f(X, Y) and g(X, Y) include the points $(x_i, y_i)$, and thus the common zeros of f(X, Y + cX) and g(X, Y + cX) include the mn + 1 points $(x_i, y_i - cx_i)$ whose second coordinates are distinct.

In other words, there is no loss of generality in assuming that the given polynomials f and g vanish at mn + 1 points whose second coordinates are distinct. Regard f(X, Y) and g(X, Y) as members f(X) and g(X) of A[X], where A = K[Y], and write

$$\begin{aligned} f(X) &= f_0 + f_1 X + \cdots + f_{m'} X^{m'},\\ g(X) &= g_0 + g_1 X + \cdots + g_{n'} X^{n'}, \end{aligned}$$

with each $f_i$ and $g_j$ in A and with $f_{m'} \ne 0$ and $g_{n'} \ne 0$. Here $m' \le m$ and $n' \le n$.

Let us rule out the possibility that m' = 0 or n' = 0. Indeed, if we had m' = 0, then the polynomial f would be nonzero and would depend on Y alone. Since f is nonzero and has degree $m \ge 1$, it has at most m roots. But we are assuming that f and g vanish at mn + 1 points whose Y coordinates are distinct, and the inequalities $m \le mn < mn + 1$ therefore give a contradiction. Thus $m' \ne 0$. Similarly $n' \ne 0$. So Theorem 8.1 is applicable.

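Form the square matrix $\mathcal{R}(f, g)$ of size m' + n' and its determinant R(f, g). The latter is a member of K[Y], and Theorem 8.1 shows that it cannot be 0, since f and g are assumed to have no nonconstant common factor in K[X, Y].

Let us bound the degree of the member $R(f, g) = \det \mathcal{R}(f, g)$ of K[Y]. Each term in the expansion of the determinant is of the form

$$\pm \prod_{1 \le i \le m'+n'} \mathcal{R}(f, g)_{i,\sigma(i)} \qquad (*)$$

for some permutation $\sigma$ of $\{1, \ldots, m' + n'\}$. Here $\mathcal{R}(f, g)_{ij}$ is given by

$$\mathcal{R}(f, g)_{ij} = \begin{cases}
f_{j-i} & \text{for } 1 \le i \le n' \text{ and for } j \text{ with } i \le j \le m' + i,\\
0 & \text{for } 1 \le i \le n' \text{ and for all other } j,\\
g_{j+n'-i} & \text{for } n'+1 \le i \le n'+m' \text{ and for } j \text{ with } i \le n'+j \le n'+i,\\
0 & \text{for } n'+1 \le i \le n'+m' \text{ and for all other } j.
\end{cases}$$

In addition, since f has total degree m and g has total degree n, the degree of $f_{j-i}$ as a member of K[Y] is at most $m - (j - i)$, and the degree of $g_{j+n'-i}$ is at most $n - (j + n' - i)$. Setting $j = \sigma(i)$, we see that the degree of any nonzero term (*) is at most

$$\begin{aligned}
\sum_{1 \le i \le n'} \bigl(m - \sigma(i) + i\bigr) &+ \sum_{n'+1 \le i \le m'+n'} \bigl(n - n' + i - \sigma(i)\bigr)\\
&= mn' + (n - n')m' - \sum_{1 \le i \le m'+n'} \sigma(i) + \sum_{1 \le i \le m'+n'} i\\
&= mn - (m - m')(n - n') \le mn.
\end{aligned}$$

Thus R(f, g) is a nonzero polynomial in K[Y] of degree at most mn. Consequently it has at most mn roots.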

Theorem 8.1 shows that $af + bg = R(f, g)$ for suitable members $a$ and $b$ of
$K[X, Y]$. Recalling that $f$ and $g$ are assumed to vanish at $mn + 1$ points whose
second coordinates are distinct, we see that $R(f, g)$ vanishes at each of these
second coordinates, and we arrive at a contradiction.

Now we can allow $K$ to be finite. Let $K'$ be an infinite extension. We have
just seen that $f$ and $g$ have a nonconstant common factor in $K'[X, Y]$. Without loss
of generality, this factor depends nontrivially on $X$. Theorem 8.1 applied with
$A = K'[Y]$ shows that $R(f, g) = 0$. The same theorem with $A = K[Y]$
then shows that $f$ and $g$ have a common factor in $A[X] = K[X, Y]$ depending
nontrivially on $X$.


Let us introduce some geometric language for the situation in Theorem 8.2.
Affine n-space over a field $K$ is the set of $n$-dimensional column vectors

\[
\mathbb{A}^n = \mathbb{A}^n_{K_{\mathrm{alg}}} = \{(x_1, \dots, x_n) \in K_{\mathrm{alg}}^n\}
\]

with entries in a fixed algebraic closure $K_{\mathrm{alg}}$ of $K$. The set of $K$ rational points,
or $K$ points, in $\mathbb{A}^n$ is the subset

\[
\mathbb{A}^n_K = \{(x_1, \dots, x_n) \in K^n\}.
\]

We shall comment on the appearance of $K_{\mathrm{alg}}$ in these definitions shortly.

Members of $\mathbb{A}^n$ are called points in $n$-dimensional affine space, and the functions
$P \mapsto x_j(P)$ give the coordinates of the points. If $L$ is any field between
$K$ and $K_{\mathrm{alg}}$, then any polynomial $f$ in $K[X_1, \dots, X_n]$ defines a corresponding
polynomial function from $\mathbb{A}^n_L$ into $L$.

For algebraic geometry the case of interest for Sections 1-6 of this chapter is 
the case n = 2. The way of viewing a curve is influenced by Cramer’s thinking as 
discussed in Section 1: the particular polynomial that defines a curve is important, 
not just the zero locus in the affine plane, but two curves are to be regarded as the 
same if each is a nonzero multiple of the other. We can incorporate this viewpoint 
into algebraic language by defining an affine plane curve C over the field K to 
be any nonzero proper principal ideal in K[X, Y]. The curve is an affine plane 
line if the degree of any generator is 1. 

In practice in studying affine plane curves, there is ordinarily no need to
distinguish between a polynomial and the principal ideal that it generates, and
we shall feel free to refer to an affine plane curve C = (f) as f when there is no
possibility of confusion.

The zero locus of a curve is the corresponding geometric notion, but it can
readily be empty, as is the case with $X^2 + Y^2 + 1$ when $K = \mathbb{R}$. On the
other hand, the Nullstellensatz (Theorem 7.1) ensures that the zero locus will be
nonempty if the underlying field is algebraically closed. Thus we define the zero
locus $V(C) = V((f))$ of the curve $C = (f(X, Y))$ by⁶

\[
V(C) = V_{K_{\mathrm{alg}}}(C) = \{(x, y) \in K_{\mathrm{alg}}^2 \mid f(x, y) = 0\}.
\]

⁵ Warning: This definition will be changed slightly in Chapter IX and again in Chapter X to
reflect changed emphasis in those chapters.

⁶ The letter “V” is the letter that is commonly used in the notation for a zero locus. It stands for
“variety,” a notion that we have not yet defined. But beware: not all objects labeled with a “V” are
actually varieties the way the term is normally defined. An affine plane curve will turn out to be a
variety exactly when the generating polynomial $f$ is prime in $K_{\mathrm{alg}}[X, Y]$.


This is the same as the set of all $(x, y)$ such that every member of the ideal $C$
vanishes at $(x, y)$. The set of $K$ rational points, or $K$ points, of $C$ is

\[
V_K(C) = V_K((f)) = \{(x, y) \in K^2 \mid f(x, y) = 0\}.
\]

When we are content to refer to an affine curve $C = (f)$ as $f$, we are content
also to write $V(f)$ in place of $V(C) = V((f))$.
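
When $K$ is a finite field, the set of $K$ points of a curve can simply be listed. The following Python sketch is only an illustration under stated assumptions: it takes $K$ to be a prime field $\mathbb{F}_p$ represented by integers modulo $p$, and the helper name `k_points` is hypothetical rather than notation from the text. It shows that $X^2 + Y^2 + 1$, which has no points over $\mathbb{R}$, does have points over $\mathbb{F}_3$.

```python
def k_points(f, p):
    """Return the pairs (x, y) over F_p with f(x, y) = 0 mod p.

    f is any Python function of two integers; p is assumed to be prime.
    """
    return [(x, y) for x in range(p) for y in range(p) if f(x, y) % p == 0]


def circle_like(x, y):
    # The polynomial X^2 + Y^2 + 1 mentioned in the text.
    return x * x + y * y + 1


points_mod_3 = k_points(circle_like, 3)
```

The brute-force search is quadratic in $p$, which is adequate for small illustrations of this kind.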

In Chapter X, under the assumption that $K$ is algebraically closed, we shall
extend these definitions from the case $n = 2$ and $C$ as above to the case that
$n$ is general and $C$ is replaced by any ideal $I$ in $K[X_1, \dots, X_n]$. The set $V(I)$
of common zeros of the members of $I$ in $K^n = K_{\mathrm{alg}}^n$ will be called an “affine
algebraic set.” The case of affine $n$-space itself arises when the ideal is 0.

For general $K$, not necessarily algebraically closed, it is meaningful to consider
the set $V_K(I)$ of $K$ rational points, i.e., the subset of common zeros lying in $K^n$.
For $I = 0$ and $V(I) = \mathbb{A}^n$, the distinction between $V_K(I)$ and $V_{K_{\mathrm{alg}}}(I)$ is hardly
worth mentioning, but the distinction is well worth making for general $I$ and is
made for the case $V(I) = \mathbb{A}^n$ for consistency. Although the study of sets $V_K(I)$
is of importance in number theory, in geometry over $\mathbb{R}$, and in other areas, we
shall not pursue it in Chapter X for lack of space.

Returning to Theorem 8.2, we see that the statement concerns $V_K(C) \cap V_K(D)$,
where $C$ and $D$ are the principal ideals $C = (f)$ and $D = (g)$ in $K[X, Y]$. The
theorem says that if $V_K(C) \cap V_K(D)$ contains more than $mn$ points, then there is
a nonconstant polynomial $h$ whose principal ideal $(h)$ contains both $(f)$ and $(g)$.


3. Projective Plane Curves


Section 2 dealt with intersections of affine plane curves. Even over an
algebraically closed field, two affine plane curves need not intersect. An example is
the pair of straight lines $X + Y - 1$ and $X + Y - 2$, whose locus of common zeros
is empty. To get these lines to intersect, we have to introduce “points at infinity.”
The projective plane is the device for including such points.

Let $K$ be a field, and let $K_{\mathrm{alg}}$ be an algebraic closure. The projective plane
over $K$ is defined set theoretically as the quotient of $K_{\mathrm{alg}}^3 - \{0\}$ by an equivalence
relation:

\[
\mathbb{P}^2 = \mathbb{P}^2_{K_{\mathrm{alg}}} = \{(x, y, w) \in K_{\mathrm{alg}}^3 - \{0\}\} / \sim,
\]

where $(x', y', w') \sim (x, y, w)$ if $(x', y', w') = \lambda(x, y, w)$ for some $\lambda \in K_{\mathrm{alg}}^\times$.
The set of $K$ rational points, or $K$ points, of $\mathbb{P}^2$ is the quotient

\[
\mathbb{P}^2_K = \{(x, y, w) \in K^3 - \{0\}\} / \sim,
\]

where $(x', y', w') \sim (x, y, w)$ if $(x', y', w') = \lambda(x, y, w)$ for some $\lambda \in K^\times$.
When there is a need to be careful, we shall write $[x, y, w]$ for the member of $\mathbb{P}^2_K$
corresponding to $(x, y, w)$ in $K^3 - \{0\}$. But often there will not be such a need,
and we shall simply refer to $(x, y, w)$ as a member of $\mathbb{P}^2_K$. Both $\mathbb{P}^2$ and $\mathbb{P}^2_K$ have
additional structure on them, given by “affine local coordinates,” and we come to
that matter later in this section.

Let us record briefly the obvious generalization of the projective plane to other
dimensions: Projective n-space over $K$ is defined set theoretically as the quotient

\[
\mathbb{P}^n = \mathbb{P}^n_{K_{\mathrm{alg}}} = \{(x_1, \dots, x_{n+1}) \in K_{\mathrm{alg}}^{n+1} - \{0\}\} / \sim,
\]

where $(x_1', \dots, x_{n+1}') \sim (x_1, \dots, x_{n+1})$ if $(x_1', \dots, x_{n+1}') = \lambda(x_1, \dots, x_{n+1})$ for
some $\lambda \in K_{\mathrm{alg}}^\times$. The set $\mathbb{P}^n_K$ of $K$ rational points of $\mathbb{P}^n$ is the set defined in similar
fashion using just nonzero vectors in $K^{n+1}$ and scalars in $K^\times$.
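
For a finite field the quotient construction can be carried out explicitly. The sketch below is an illustration only, assuming $K = \mathbb{F}_p$ with $p$ prime and using the hypothetical helper names `normalize` and `projective_points`. Each equivalence class is represented by the unique member whose last nonzero coordinate is 1, a normalization chosen here for convenience and not taken from the text.

```python
from itertools import product


def normalize(v, p):
    """Scale a nonzero vector over F_p so that its last nonzero entry is 1."""
    pivot = next(c for c in reversed(v) if c % p != 0)
    inv = pow(pivot, -1, p)  # modular inverse; requires Python 3.8+
    return tuple((c * inv) % p for c in v)


def projective_points(n, p):
    """List one normalized representative for each point of P^n over F_p."""
    vectors = product(range(p), repeat=n + 1)
    return sorted({normalize(v, p) for v in vectors if any(v)})


plane_mod_3 = projective_points(2, 3)
```

Counting nonzero vectors and dividing by the number of nonzero scalars predicts $(p^{n+1} - 1)/(p - 1)$ points, so the plane over $\mathbb{F}_3$ should have 13 points.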

Scalar-valued functions on $\mathbb{P}^n_K$ are of little interest because they amount to
scalar-valued functions of $K^{n+1} - \{0\}$ that are unchanged when $(x_1, \dots, x_{n+1})$ is
replaced by a multiple of itself. A polynomial of this kind, for example, is
necessarily constant. Instead, the polynomials of interest that are related to $\mathbb{P}^n_K$ are
“homogeneous polynomials.” A monomial in $K[X_1, \dots, X_{n+1}]$ is a polynomial
of the form $X_1^{j_1} \cdots X_{n+1}^{j_{n+1}}$; its total degree is $\sum_{i=1}^{n+1} j_i$. We say that a nonzero
$F$ in $K[X_1, \dots, X_{n+1}]$ is homogeneous of degree $d \ge 0$ if every monomial
appearing in $F$ with nonzero coefficient has total degree $d$. By convention the 0
polynomial is homogeneous of every degree. We write $K[X_1, \dots, X_{n+1}]_d$ for
the set of homogeneous polynomials of degree $d$. Each such $F$ satisfies

\[
F(\lambda x_1, \dots, \lambda x_{n+1}) = \lambda^d F(x_1, \dots, x_{n+1})
\]

for all $(x_1, \dots, x_{n+1}) \in K^{n+1}$ and $\lambda \in K^\times$. Conversely the fact that the mapping
of polynomials into polynomial functions is one-one for an infinite field implies
that homogeneous polynomials over an infinite field can be detected by this
property.

Let us assemble some further properties of homogeneous polynomials: The
monomials of total degree $d$ form a $K$ basis of the vector space $K[X_1, \dots, X_{n+1}]_d$;
this fact follows from the definition of polynomials over $K$. To calculate the
dimension of $K[X_1, \dots, X_{n+1}]_d$, consider the problem of taking $d$ factors $X$ on
which to place subscripts and using $n$ dividers to separate the $X_1$’s from the $X_2$’s
and so on. The number of monomials in question is just the number of ways of
selecting the $n$ dividers from among the $d + n$ symbols and dividers. Thus we
obtain the important formula

\[
\dim_K K[X_1, \dots, X_{n+1}]_d = \binom{d+n}{n}.
\]
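
The counting argument can be checked by enumerating the monomials directly. The following Python sketch is illustrative only; the function name `monomials` is hypothetical, and a monomial is represented by its vector of exponents.

```python
from itertools import combinations_with_replacement
from math import comb


def monomials(num_vars, d):
    """Exponent vectors of all monomials of total degree d in num_vars variables."""
    result = []
    # Choosing d variable indices with repetition allowed is the same as
    # placing subscripts on d factors X, as in the text.
    for choice in combinations_with_replacement(range(num_vars), d):
        exponents = [0] * num_vars
        for index in choice:
            exponents[index] += 1
        result.append(tuple(exponents))
    return result


# n + 1 = 3 variables (the case of the projective plane), degree d = 4.
quartic_monomials = monomials(3, 4)
```

For the projective plane, $n = 2$, the formula gives $\binom{d+2}{2}$, so homogeneous quartics in three variables form a space of dimension 15.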


Lemma 8.3. Any polynomial factor of a homogeneous polynomial over a field
$K$ is homogeneous.

PROOF. Write $F = F_1 F_2$ nontrivially. Let $d_1$ and $e_1$ be the highest and lowest
total degrees of terms in $F_1$, and let $d_2$ and $e_2$ be the highest and lowest total
degrees of terms in $F_2$. The product of the terms of total degree $d_1$ in $F_1$ and
the terms of total degree $d_2$ in $F_2$ is nonzero and is the part of $F$ of total degree
$d_1 + d_2$. The product of the terms of total degree $e_1$ in $F_1$ and the terms of total
degree $e_2$ in $F_2$ is nonzero and is the part of $F$ of total degree $e_1 + e_2$. Since $F$ is
homogeneous, $d_1 + d_2 = e_1 + e_2$. Because $d_1 \ge e_1$ and $d_2 \ge e_2$, it follows that
$d_1 = e_1$ and $d_2 = e_2$; thus $F_1$ and $F_2$ are homogeneous.


An ideal $I$ in $K[X_1, \dots, X_{n+1}]$ is called a homogeneous ideal if it is the sum
over $d \ge 0$ of its intersections with $K[X_1, \dots, X_{n+1}]_d$:

\[
I = \bigoplus_{d=0}^{\infty} \bigl(I \cap K[X_1, \dots, X_{n+1}]_d\bigr).
\]

The sum is to be regarded as a direct sum of vector spaces. For such an ideal, we
can compute the quotient $K[X_1, \dots, X_{n+1}]/I$ term by term:

\[
K[X_1, \dots, X_{n+1}]/I = \bigoplus_{d=0}^{\infty} K[X_1, \dots, X_{n+1}]_d \big/ \bigl(I \cap K[X_1, \dots, X_{n+1}]_d\bigr).
\]

We can often recognize a homogeneous ideal from its generators: an ideal with a
set of generators that are all homogeneous is necessarily a homogeneous ideal. In
fact, if an ideal $I$ has homogeneous generators $F_j$, then the most general member
of $I$ is a finite sum of terms $A_j F_j$. The terms of total degree $d$ in $A_j F_j$ are the
product of $F_j$ with the terms in $A_j$ of total degree $d - \deg F_j$, and each such term
is in $I$. Hence each member of $I$ is a sum of homogeneous polynomials that lie
in $I$, and the assertion follows.

In the setting of $\mathbb{P}^2$, projective plane curves over $K$ are initially defined to be
nonconstant homogeneous polynomials in $K[X, Y, W]$. Although such polynomials
are not well defined on the projective plane, their zero loci are well defined
subsets of $\mathbb{P}^2$. As in the affine case, the particular polynomial that defines a curve
is important, not just the zero locus, but two curves are to be regarded as the same
if each is a nonzero multiple of the other. We can incorporate this viewpoint into
algebraic language by defining a projective plane curve of degree $d > 0$ over
the field $K$ to be any nonzero proper principal ideal in $K[X, Y, W]$ generated by a
homogeneous polynomial of degree $d$. Such an ideal is necessarily homogeneous.
In the special cases that $d = 1, 2, 3$, or $4$, the curve is called a projective line,
conic, cubic, or quartic respectively.

Just as in the affine case, in practice in studying projective plane curves,
there is often no need to distinguish between a homogeneous polynomial and
the homogeneous principal ideal that it generates, and we shall feel free to refer
to a projective plane curve $C = (F) \subseteq K[X, Y, W]$ as $F$ when there is no
possibility of confusion.

If $(F)$ is a projective plane curve of degree $d$, then its zero locus is denoted by

\[
V((F)) = V_{K_{\mathrm{alg}}}((F)) = \{[x, y, w] \in \mathbb{P}^2 \mid F(x, y, w) = 0\}.
\]

The locus

\[
V_K((F)) = \{[x, y, w] \in \mathbb{P}^2_K \mid F(x, y, w) = 0\}
\]

is called the set of $K$ rational points, or $K$ points, of the curve. When we allow
ourselves to refer to the curve simply as $F$, then we can write $V(F)$ in place of
$V((F))$.


The affine plane $\mathbb{A}^2_K = \{(x, y)\}$ has a standard one-one embedding into the
projective plane $\mathbb{P}^2_K$. Namely we map $(x, y)$ into $[x, y, 1]$. The set that is missed
by the image is the set with $w = 0$, which is the set of $K$ rational points of the line
$L$ with $L(X, Y, W) = W$, a line called the line at infinity. We shall denote this
line by $W$. The points of $V_K(W)$, i.e., those with $w = 0$, are called the points at
infinity.

Except for the line at infinity, lines in $\mathbb{P}^2_K$ correspond under restriction exactly
to lines in $K^2$. Namely the projective line $L(X, Y, W) = aX + bY + cW$
corresponds to the affine line $l(X, Y) = aX + bY + c$, and vice versa. In certain
ways the geometry of $\mathbb{P}^2_K$ is simpler than the geometry of $\mathbb{A}^2_K$:


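(i) Two distinct lines in $\mathbb{P}^2_K$ intersect in a unique point. In fact, if the
lines have coefficient row vectors $(a \; b \; c)$ and $(a' \; b' \; c')$, we set up the
system of equations

\[
\begin{pmatrix} a & b & c \\ a' & b' & c' \end{pmatrix}
\begin{pmatrix} x \\ y \\ w \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\]

Since the lines are distinct, the coefficient matrix has rank 2. Thus
the kernel has dimension 1, and there is just one point $[x, y, w]$ in the
intersection.

(ii) Two distinct points in $\mathbb{P}^2_K$ lie on a unique line. In fact, if the
points are $[x, y, w]$ and $[x', y', w']$, we set up the system of equations

\[
\begin{pmatrix} x & y & w \\ x' & y' & w' \end{pmatrix}
\begin{pmatrix} a \\ b \\ c \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \end{pmatrix}
\]

for the coefficients of the line and argue in similar fashion.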
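
Both rank-2 systems can be solved in one step with the cross product, which spans the one-dimensional kernel of a 2-by-3 matrix of rank 2. The cross product is a standard device that is not used in the text, so the following Python sketch should be read as an illustration under that assumption; the helper names `cross` and `dot` are hypothetical.

```python
def cross(u, v):
    """Cross product of two 3-vectors; it spans the kernel when u, v are independent."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# The parallel affine lines X + Y - 1 and X + Y - 2, written as the
# projective lines X + Y - W and X + Y - 2W.
line1 = (1, 1, -1)
line2 = (1, 1, -2)
meet = cross(line1, line2)  # a representative (x, y, w) of the common point

# The line through the points [1, 0, 1] and [0, 1, 1].
join = cross((1, 0, 1), (0, 1, 1))
```

Here `meet` has $w = 0$, so the two parallel affine lines intersect at a point at infinity, while `join` gives the coefficients of the line $X + Y - W$ up to a scalar.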


Along with the embedding of $\mathbb{A}^2_K$ into $\mathbb{P}^2_K$ is a correspondence between projective
curves and affine curves. Let us work with the polynomials themselves,
without identifying each polynomial with every nonzero scalar multiple of itself.
The passage from a nonzero homogeneous polynomial $F(X, Y, W)$ of degree
$d \ge 0$ to a polynomial $f(X, Y)$ is given by $f(X, Y) = F(X, Y, 1)$. The mapping
$F \mapsto f$ is a substitution homomorphism, and it therefore respects products.
However, the degree may drop in the process, and in particular $f(X, Y)$ is a
constant if and only if $F(X, Y, W)$ is a multiple of $W^d$.

In the reverse direction if $f(X, Y)$ is a polynomial of degree $e$, then $f(X, Y)$
arises from a polynomial $F(X, Y, W)$, but we have to specify the degree $d$ of $F$
and we must have $d \ge e$. Operationally we obtain $F$ by inserting a power of $W$
into each term of $f$ to make the total degree of the term become $d$. For example,
with $f(X, Y) = Y^2 + XY + X^3$ if the desired degree is 3, then
$F(X, Y, W) = Y^2 W + XYW + X^3$. On the other hand, if the desired degree is 4, then
$F(X, Y, W) = Y^2 W^2 + XYW^2 + X^3 W$.

The formula for this reverse process is $F(X, Y, W) = W^d f(XW^{-1}, YW^{-1})$.
That is, $F$ is given by a substitution homomorphism, followed by multiplication
by a power of $W$. From this fact, we can read off conclusions of the following
kind:

     If polynomials $f(X, Y)$ and $g(X, Y)$ are obtained from homogeneous
     polynomials $F(X, Y, W)$ and $G(X, Y, W)$ by taking $W = 1$,
     then there exist integers $r$ and $s$ such that the polynomial
     $W^r F(X, Y, W) + W^s G(X, Y, W)$ is homogeneous and such that
     $f(X, Y) + g(X, Y)$ is obtained from it by taking $W = 1$.
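
The two passages between $F$ and $f$ are easy to carry out on exponent vectors. The following Python sketch is an illustration under the assumption that a polynomial is stored as a dictionary from exponent tuples to coefficients; the names `dehomogenize` and `homogenize` are hypothetical.

```python
def dehomogenize(F):
    """Set W = 1: map {(i, j, k): c} for X^i Y^j W^k to {(i, j): c}."""
    f = {}
    for (i, j, _k), c in F.items():
        f[(i, j)] = f.get((i, j), 0) + c
    return {mono: c for mono, c in f.items() if c != 0}


def homogenize(f, d):
    """Insert powers of W so that every term of f has total degree d."""
    if any(i + j > d for (i, j) in f):
        raise ValueError("d must be at least the degree of f")
    return {(i, j, d - i - j): c for (i, j), c in f.items()}


# f(X, Y) = Y^2 + XY + X^3, the example in the text.
f = {(0, 2): 1, (1, 1): 1, (3, 0): 1}
F3 = homogenize(f, 3)  # Y^2 W + XYW + X^3
F4 = homogenize(f, 4)  # Y^2 W^2 + XYW^2 + X^3 W
```

Applying `dehomogenize` to either `F3` or `F4` returns `f`, which reflects the fact that the degree $d$ has to be specified separately when passing from $f$ back to $F$.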


As we mentioned above, $\mathbb{P}^2_K$ has more structure than simply the structure of
a set. About any point in $\mathbb{P}^2_K$ we can introduce various systems of “affine local
coordinates.” The idea is to imitate what happens in the definition of a manifold:
the whole manifold is covered by charts, each giving an invertible mapping of a
set in the manifold to an open subset of Euclidean space. Here a single system
of affine local coordinates plays the role of a chart; it puts $\mathbb{A}^2_K$ into one-one
correspondence with the complement of the zero locus of a line in $\mathbb{P}^2_K$.

Let $\Phi$ be a member of the matrix group $GL(3, K)$. Then $\Phi$ maps the set $K^3$
of column vectors in one-one fashion onto $K^3$ and passes to a one-one map of
$\mathbb{P}^2_K$ onto $\mathbb{P}^2_K$ called the projective transformation corresponding to $\Phi$. Two $\Phi$’s
give the same map of $\mathbb{P}^2_K$ if and only if they are multiples of one another. The
group action of $GL(3, K)$ on $\mathbb{P}^2_K$ is transitive because $GL(3, K)$ acts transitively
on $K^3 - \{(0, 0, 0)\}$.

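If $L$ is the projective line whose coefficients are given by the row vector
$(a \; b \; c)$ and if $\Phi$ is in $GL(3, K)$, then the row vector $(a \; b \; c)\,\Phi^{-1}$
defines a new projective line $L^\Phi$, and the $K$ rational points of $L^\Phi$ are given by

\[
V_K(L^\Phi) = \Phi(V_K(L)).
\]

In fact, let $\begin{pmatrix} x \\ y \\ w \end{pmatrix}$ be in $V_K(L)$. Then $\begin{pmatrix} x' \\ y' \\ w' \end{pmatrix} = \Phi \begin{pmatrix} x \\ y \\ w \end{pmatrix}$ is in $\Phi(V_K(L))$ and
satisfies

\[
(a \; b \; c)\,\Phi^{-1} \begin{pmatrix} x' \\ y' \\ w' \end{pmatrix} = 0;
\]

hence it is in $V_K(L^\Phi)$. Conversely if $\begin{pmatrix} x' \\ y' \\ w' \end{pmatrix}$ is in $V_K(L^\Phi)$, then $\begin{pmatrix} x \\ y \\ w \end{pmatrix} = \Phi^{-1} \begin{pmatrix} x' \\ y' \\ w' \end{pmatrix}$
satisfies

\[
(a \; b \; c) \begin{pmatrix} x \\ y \\ w \end{pmatrix} = (a \; b \; c)\,\Phi^{-1} \begin{pmatrix} x' \\ y' \\ w' \end{pmatrix} = 0,
\]

and thus $\begin{pmatrix} x' \\ y' \\ w' \end{pmatrix}$ is $\Phi$ of something in $V_K(L)$.

To form the analog of a chart, fix $[x_0, y_0, w_0]$ in $\mathbb{P}^2_K$. Choose (by transitivity)
some $\Phi$ in $GL(3, K)$ with $\Phi(x_0, y_0, w_0) = (0, 0, 1)$. Then we can define affine
local coordinates on $\Phi^{-1}(K \times K \times \{1\})$ to $K^2$ by the one-one map

\[
\varphi(\Phi^{-1}(x, y, 1)) = (x, y).
\]

This definition generalizes the standard embedding of the affine plane $K^2$ into
$\mathbb{P}^2_K$ earlier; that embedding was the case $\Phi = 1$.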


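EXAMPLES OF AFFINE LOCAL COORDINATES FOR $\mathbb{P}^2_K$.


(1) Suppose $(x_0, y_0, w_0) = (x_0, y_0, 1)$. We can choose $\Phi = \begin{pmatrix} 1 & 0 & -x_0 \\ 0 & 1 & -y_0 \\ 0 & 0 & 1 \end{pmatrix}$. Then

\[
\Phi \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
= \begin{pmatrix} 1 & 0 & -x_0 \\ 0 & 1 & -y_0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}
= \begin{pmatrix} x - x_0 \\ y - y_0 \\ 1 \end{pmatrix}.
\]

In this case, the local coordinates are defined on

\[
\Phi^{-1}(K \times K \times \{1\}) = K \times K \times \{1\}
\]

and are given by

\[
\begin{aligned}
\varphi(x, y, 1) &= \varphi(\Phi^{-1}(\Phi(x, y, 1))) \\
&= \varphi(\Phi^{-1}(x - x_0, y - y_0, 1)) = (x - x_0, y - y_0).
\end{aligned}
\]

This $\Phi$ is handy for reducing behavior about $(x_0, y_0, 1)$ in $\mathbb{P}^2_K$ to behavior about
$(0, 0)$ in $K^2$.

(2) Suppose $(x_0, y_0, w_0) = (0, 1, 0)$. We can choose $\Phi = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$. Then

\[
\Phi \begin{pmatrix} x \\ 1 \\ w \end{pmatrix}
= \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} x \\ 1 \\ w \end{pmatrix}
= \begin{pmatrix} w \\ x \\ 1 \end{pmatrix}
\]

and

\[
\varphi(x, 1, w) = \varphi(\Phi^{-1}(\Phi(x, 1, w))) = \varphi(\Phi^{-1}(w, x, 1)) = (w, x).
\]

This $\Phi$ is handy for studying behavior near one of the points at infinity in $\mathbb{P}^2_K$.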


We can use affine local coordinates to examine the behavior of a projective
plane curve “near a particular point,” by which is meant “with that point as the
center point in the analysis.” To examine behavior near $(0, 0, 1)$, we use the
correspondence $f(X, Y) = F(X, Y, 1)$ that we discussed earlier. For a general
point, we make use of the fact that whenever $F$ is a homogeneous polynomial of
degree $d$, then so is $F \circ \Phi^{-1}$. To examine the behavior of $F$ near a point $(x_0, y_0, w_0)$
in $K^3 - \{(0, 0, 0)\}$, we choose $\Phi$ in $GL(3, K)$ with $\Phi(x_0, y_0, w_0) = (0, 0, 1)$,
and we define

\[
f(X, Y) = F(\Phi^{-1}(X, Y, 1)).
\]

Under this correspondence the behavior of $F$ at $(x_0, y_0, w_0)$ is reflected in the
behavior of $f$ at $(0, 0)$. We call $f(X, Y)$ the local expression for $F$ in the affine
local coordinates determined by $\Phi$. This local expression is a polynomial in
$K[X, Y]$, and it is nonconstant unless $F$ is a scalar multiple of $(W \circ \Phi)^d$ for
some $d$.


EXAMPLES, CONTINUED.


(1) Suppose that $(x_0, y_0, w_0) = (x_0, y_0, 1)$ and that $\Phi = \begin{pmatrix} 1 & 0 & -x_0 \\ 0 & 1 & -y_0 \\ 0 & 0 & 1 \end{pmatrix}$. Computation gives

\[
\Phi^{-1} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = \begin{pmatrix} x + x_0 \\ y + y_0 \\ 1 \end{pmatrix},
\]

and the corresponding local expression for a projective plane curve $F$ is

\[
f(X, Y) = F(X + x_0, Y + y_0, 1).
\]

For the projective plane curve

\[
F(X, Y, W) = X^2 Y + XYW + W^3
\]

and the same $\Phi$, the local expression $f(X, Y)$ splits into homogeneous terms as

\[
\begin{aligned}
f(X, Y) &= (x_0^2 y_0 + x_0 y_0 + 1) + (x_0^2 Y + 2x_0 y_0 X + x_0 Y + y_0 X) \\
&\qquad + (y_0 X^2 + 2x_0 XY + XY) + (X^2 Y).
\end{aligned}
\]

We shall use this splitting in the next section in the first example of intersection
multiplicity.

(2) Suppose that $(x_0, y_0, w_0) = (0, 1, 0)$ and that $\Phi = \begin{pmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}$. Then

\[
\Phi^{-1} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = \begin{pmatrix} y \\ 1 \\ x \end{pmatrix},
\]

and the local expression for a projective plane curve $F$ relative to this $\Phi$ is

\[
f(X, Y) = F(Y, 1, X).
\]

For the same projective plane curve $F$ as in Example 1, namely

\[
F(X, Y, W) = X^2 Y + XYW + W^3,
\]

we obtain

\[
f(X, Y) = (Y^2 + XY) + (X^3).
\]

We shall examine this example further in the next section.


In this way we have associated to each projective plane curve $F$ and to the
system of affine local coordinates determined by a member $\Phi$ of $GL(3, K)$ a local
expression that is a nonzero polynomial in $K[X, Y]$. Conversely if the degree
$d$ and the member $\Phi$ of $GL(3, K)$ are given and if $f$ in $K[X, Y]$ is nonzero of
degree at most $d$, then we can reconstruct a projective plane curve $F$ of degree
$d$ whose local expression relative to $\Phi$ is $f$. We have only to form the unique
homogeneous polynomial $G$ of degree $d$ with $f(X, Y) = G(X, Y, 1)$ and then
put $F = G \circ \Phi$.
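
The local expression $F(\Phi^{-1}(X, Y, 1))$ can be computed by substituting the three coordinates of $\Phi^{-1}(X, Y, 1)$ into $F$ and expanding. The Python sketch below illustrates this under stated assumptions: polynomials are dictionaries from exponent tuples to coefficients, the matrix of $\Phi^{-1}$ is supplied by the caller rather than computed, and the names `poly_mul`, `poly_add`, and `local_expression` are hypothetical.

```python
from fractions import Fraction


def poly_mul(p, q):
    """Multiply two polynomials in X, Y stored as {(i, j): coefficient}."""
    out = {}
    for (a, b), c in p.items():
        for (i, j), e in q.items():
            out[(a + i, b + j)] = out.get((a + i, b + j), 0) + c * e
    return {m: c for m, c in out.items() if c != 0}


def poly_add(p, q):
    out = dict(p)
    for m, c in q.items():
        out[m] = out.get(m, 0) + c
    return {m: c for m, c in out.items() if c != 0}


def local_expression(F, phi_inv):
    """Compute f(X, Y) = F(Phi^{-1}(X, Y, 1)).

    F is {(i, j, k): c} for X^i Y^j W^k, and phi_inv is the 3-by-3 matrix
    of Phi^{-1} (assumed given; no matrix inversion is done here).
    """
    # Each coordinate of Phi^{-1}(X, Y, 1) has degree at most 1 in X, Y.
    coords = [{(1, 0): r[0], (0, 1): r[1], (0, 0): r[2]} for r in phi_inv]
    coords = [{m: c for m, c in p.items() if c != 0} for p in coords]
    f = {}
    for exponents, c in F.items():
        term = {(0, 0): c}
        for coord, e in zip(coords, exponents):
            for _ in range(e):
                term = poly_mul(term, coord)
        f = poly_add(f, term)
    return f


# F = X^2 Y + XYW + W^3, the curve used in the examples.
F = {(2, 1, 0): 1, (1, 1, 1): 1, (0, 0, 3): 1}

# Example 2: Phi^{-1}(X, Y, 1) = (Y, 1, X).
f_at_infinity = local_expression(F, [[0, 1, 0], [0, 0, 1], [1, 0, 0]])

# Example 1 with (x0, y0) = (1, -1/2): Phi^{-1}(X, Y, 1) = (X + 1, Y - 1/2, 1).
half = Fraction(1, 2)
f_affine = local_expression(F, [[1, 0, 1], [0, 1, -half], [0, 0, 1]])
```

In the second computation the constant term cancels, which is the statement that the point $(1, -\tfrac12, 1)$ lies on the curve.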


With these preparations in place, we return to a consideration of resultants and 
Bezout’s Theorem. Our objective is to rephrase Theorem 8.2 to take advantage 
of properties of the projective plane. 


Lemma 8.4. Let $K$ be a field, let $A$ be the polynomial ring $A = K[x_1, \dots, x_r]$,
and let $f$ and $g$ be members of $A[X]$ of the form

\[
\begin{aligned}
f(X) &= f_0 + f_1 X + \cdots + f_m X^m, \\
g(X) &= g_0 + g_1 X + \cdots + g_n X^n,
\end{aligned}
\]

where $f_j$ is a member of $A$ homogeneous of degree $m' - j$ and $g_j$ is a member of
$A$ homogeneous of degree $n' - j$. Then the resultant $R(f, g)$ is a homogeneous
member of $A$ of degree $mn' + m'n - mn$.

REMARKS. In the application to proving Theorem 8.5, we will have $m' = m$
and $n' = n$, and then $R(f, g)$ is homogeneous of degree $mn$. Problem 8 at the
end of the chapter concerns a situation for which $m' \neq m$ and $n' \neq n$.
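
The degree asserted in Lemma 8.4 can be tested numerically in a small case. The following Python sketch is an illustration only and makes several assumptions: it takes $r = 1$, so that $f_j = a_j x^{m'-j}$ and $g_j = b_j x^{n'-j}$ for integers $a_j$ and $b_j$; it lays out the matrix with $n$ shifted rows of coefficients of $f$ followed by $m$ shifted rows of coefficients of $g$, which may differ from the convention of Theorem 8.1 by a sign; and the helper names are hypothetical. Homogeneity of degree $k$ then means that the value at $x = 2$ is $2^k$ times the value at $x = 1$.

```python
def det(matrix):
    """Exact determinant by Laplace expansion along the first row (small sizes)."""
    if not matrix:
        return 1
    total = 0
    for col, entry in enumerate(matrix[0]):
        if entry == 0:
            continue
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def sylvester(f, g):
    """n shifted rows of f = (f_0, ..., f_m), then m shifted rows of g."""
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    rows = [[0] * i + list(f) + [0] * (size - m - 1 - i) for i in range(n)]
    rows += [[0] * i + list(g) + [0] * (size - n - 1 - i) for i in range(m)]
    return rows


def resultant_at(x, a, b, m_prime, n_prime):
    """Determinant when f_j = a[j] * x^(m' - j) and g_j = b[j] * x^(n' - j)."""
    f = [c * x ** (m_prime - j) for j, c in enumerate(a)]
    g = [c * x ** (n_prime - j) for j, c in enumerate(b)]
    return det(sylvester(f, g))


# m = 2, n = 1, m' = 3, n' = 2, so the predicted degree is mn' + m'n - mn = 5.
a, b = [3, -1, 2], [5, 7]
value_at_1 = resultant_at(1, a, b, 3, 2)
value_at_2 = resultant_at(2, a, b, 3, 2)
```

A single pair of evaluations does not prove homogeneity, but it does distinguish the predicted degree 5 from neighboring values.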


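PROOF. There is no loss of generality in assuming that $K$ is algebraically closed,
hence in particular is infinite. Each nonzero entry $\mathcal{R}(f, g)_{ij}$ of $\mathcal{R}(f, g)$ is a
coefficient of $f$ or of $g$. For each entry, define $p(i, j)$ such that
$\mathcal{R}(f, g)_{ij}(tx_1, \dots, tx_r) = t^{p(i,j)} \mathcal{R}(f, g)_{ij}(x_1, \dots, x_r)$.
The assembled matrix $\mathcal{R}$ with powers of $t$ in place is

\[
\begin{pmatrix}
t^{m'} f_0 & t^{m'-1} f_1 & \cdots & t^{m'-m} f_m & 0 & \cdots \\
0 & t^{m'} f_0 & t^{m'-1} f_1 & \cdots & & \\
\vdots & & \ddots & & & \\
t^{n'} g_0 & t^{n'-1} g_1 & \cdots & t^{n'-n} g_n & 0 & \cdots \\
0 & t^{n'} g_0 & t^{n'-1} g_1 & \cdots & & \\
\vdots & & \ddots & & &
\end{pmatrix}. \tag{$*$}
\]

It turns out that there is a function $q(i)$ such that $r(j) = q(i) + p(i, j)$ depends
only on $j$. Here $t^{q(i)}$ is the $i$th entry of

\[
(t^{n'}, t^{n'-1}, \dots, t^{n'-n+1}, t^{m'}, t^{m'-1}, \dots, t^{m'-m+1}).
\]

The matrix $(*)$ with $t^{q(i)}$ multiplying every entry of the $i$th row is

\[
\begin{pmatrix}
t^{n'} t^{m'} f_0 & t^{n'} t^{m'-1} f_1 & \cdots & t^{n'} t^{m'-m} f_m & 0 & \cdots \\
0 & t^{n'-1} t^{m'} f_0 & t^{n'-1} t^{m'-1} f_1 & \cdots & & \\
\vdots & & \ddots & & & \\
t^{m'} t^{n'} g_0 & t^{m'} t^{n'-1} g_1 & \cdots & t^{m'} t^{n'-n} g_n & 0 & \cdots \\
0 & t^{m'-1} t^{n'} g_0 & t^{m'-1} t^{n'-1} g_1 & \cdots & & \\
\vdots & & \ddots & & &
\end{pmatrix}. \tag{$**$}
\]

In $(**)$, $t^{r(j)}$ is the $j$th entry of $(t^{m'+n'}, t^{m'+n'-1}, \dots, t^{m'+n'-m-n+1})$. Then we
have

\[
t^u R(f, g)(tx_1, \dots, tx_r) = t^v R(f, g)(x_1, \dots, x_r),
\]

where $u = \sum_i q(i)$ and $v = \sum_j r(j)$. So

\[
R(f, g)(tx_1, \dots, tx_r) = t^{v-u} R(f, g)(x_1, \dots, x_r).
\]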


In other words, $R(f, g)$ is a homogeneous function. Since $K$ is infinite, $R(f, g)$
is homogeneous as a member of $A$. Computing $u$ and $v$, we find that
$u = mm' + nn' - \tfrac12 m(m-1) - \tfrac12 n(n-1)$ and $v = (m+n)(m'+n') - \tfrac12 (m+n)(m+n-1)$.
Therefore $v - u = mn' + m'n - mn$, and the degree of homogeneity of $R(f, g)$
is $mn' + m'n - mn$.


Theorem 8.5 (Bezout’s Theorem). Let $K$ be a field, let $K_{\mathrm{alg}}$ be an algebraic
closure, and suppose that $F$ in $K[X, Y, W]_m$ and $G$ in $K[X, Y, W]_n$ are projective
plane curves. Then their locus $V(F) \cap V(G)$ of common zeros in $\mathbb{P}^2_{K_{\mathrm{alg}}}$ is
nonempty. If this zero locus has more than $mn$ points, then $F$ and $G$ have as a
common factor some homogeneous polynomial in $(X, Y, W)$ of positive degree.


REMARKS. For two polynomials $f(X, Y)$ and $g(X, Y)$ in affine space, application
of Theorem 8.1 concerning the resultant in the $Y$ variable involves checking
that at least one of the polynomials has the expected degree in the $Y$ variable, and
doing so may not be so easy. In the projective setting, this problem disappears
if we apply a projective transformation and arrange that $[0, 0, 1]$ not be on the
zero locus of one of the given polynomials, say $F(X, Y, W)$. In fact, if $F$ is in
$K[X, Y, W]_m$, then the coefficient of $W^m$ has to be a constant, and this term is
the only term of $F$ that contributes to the value of $F$ at $(0, 0, 1)$. With the above
adjustment the coefficient must be nonzero, and Theorem 8.1 is applicable.


PROOF. Without loss of generality, we may assume throughout that $K$ is
algebraically closed. Write $F$ and $G$ in the form

\[
\begin{aligned}
F(X, Y, W) &= f_0 + f_1 W + \cdots + f_m W^m && \text{with } f_j \in K[X, Y]_{m-j}, \\
G(X, Y, W) &= g_0 + g_1 W + \cdots + g_n W^n && \text{with } g_j \in K[X, Y]_{n-j}.
\end{aligned} \tag{$*$}
\]

Pick a point $(x, y, w)$ at which $F$ is nonzero, and move it to $(0, 0, 1)$ by a projective
transformation, so that $F(0, 0, 1) \neq 0$. Regarding $F$ and $G$ as polynomials in $W$,
with coefficients in $A = K[X, Y]$, we form $R(F, G)$, which Lemma 8.4 identifies
as a member of $K[X, Y]_{mn}$.

Since $R(F, G)$ is homogeneous as a member of $K[X, Y]$ and since $K$ is
algebraically closed, we can choose a point $(x_0, y_0) \neq (0, 0)$ with $R(F, G)(x_0, y_0) = 0$.
Then the resultant of $F(x_0, y_0, W)$ and $G(x_0, y_0, W)$ is 0, and Theorem 8.1
applies because $F(x_0, y_0, W)$ has degree $m$ in $W$. The theorem says
that these two polynomials in $W$ have a common factor. Since $K$ is
algebraically closed, this common factor vanishes at some $w_0$, and then we must
have $F(x_0, y_0, w_0) = G(x_0, y_0, w_0) = 0$. This proves the first conclusion.

For the second conclusion, suppose that $V(F) \cap V(G)$ contains $mn + 1$ points.
Join these points by lines, and pick a point of $\mathbb{P}^2_K$ that is not on any of the lines.
We can do so because $K$, being algebraically closed, is infinite. Applying a
projective transformation, we may assume that the point is $[0, 0, 1]$. Write $F$ and
$G$ in the form $(*)$. Regarding $F$ and $G$ as polynomials in $W$, with coefficients in
$A = K[X, Y]$, we again form $R(F, G)$, which Lemma 8.4 identifies as a member
of $K[X, Y]_{mn}$. For fixed $(x_0, y_0)$, Theorem 8.1 says that $R(F, G)(x_0, y_0) = 0$ if
and only if $F(x_0, y_0, W)$ and $G(x_0, y_0, W)$ have a common factor (necessarily a
common factor of the form $W - w_0$ because $K$ is algebraically closed), if and
only if $F(x_0, y_0, w_0) = G(x_0, y_0, w_0) = 0$ for some $w_0$. So at each of our $mn + 1$
points, say $(x_i, y_i, w_i)$, we have $R(F, G)(cx_i, cy_i) = 0$ for all scalars $c$. Since
$(x_i, y_i) \neq (0, 0)$, $R(F, G)$ vanishes on the line $y_i X - x_i Y = 0$. Consequently
$y_i X - x_i Y$ divides $R(F, G)$ in $K[X, Y]$.

Suppose that $(x_j, y_j)$ is a multiple of $(x_i, y_i)$ with $i \neq j$. Then $(x_i, y_i, w_i)$ and
$(x_j, y_j, w_j)$ both satisfy $y_i X - x_i Y = 0$. Since $(0, 0, 1)$ satisfies this also and since
$(0, 0, 1)$ is not to be on any of the connecting lines, we obtain a contradiction.

Thus the $mn + 1$ factors $y_i X - x_i Y$ are nonassociate primes in $K[X, Y]$ dividing
$R(F, G)$. By unique factorization for $K[X, Y]$, their product divides $R(F, G)$.
Since $\deg R(F, G) = mn$, we conclude that $R(F, G) = 0$. Then Theorem
8.1 shows that $F$ and $G$ have a nonconstant common factor in $K[X, Y][W] = K[X, Y, W]$.
The common factor is homogeneous by Lemma 8.3, and the second
conclusion is proved.
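
The bound in Theorem 8.5 can be observed on $K$ rational points over a small finite field, since $V_K(F) \cap V_K(G)$ is a subset of $V(F) \cap V(G)$. The Python sketch below is an illustration only. It assumes $K = \mathbb{F}_7$, it uses two conics chosen here for the purpose (they are not examples from the text), and the helper name `projective_zeros` is hypothetical. Both conics are irreducible and are not proportional, so they have no common factor and the bound $mn = 4$ applies.

```python
from itertools import product


def projective_zeros(F, p):
    """Normalized representatives [x, y, w] over F_p with F(x, y, w) = 0 mod p."""
    zeros = set()
    for v in product(range(p), repeat=3):
        if not any(v) or F(*v) % p != 0:
            continue
        # Scale so that the last nonzero coordinate is 1; this is harmless
        # because F is homogeneous.
        pivot = next(c for c in reversed(v) if c != 0)
        inv = pow(pivot, -1, p)  # modular inverse; requires Python 3.8+
        zeros.add(tuple((c * inv) % p for c in v))
    return zeros


def conic_f(x, y, w):
    return x * x + y * y - w * w  # X^2 + Y^2 - W^2


def conic_g(x, y, w):
    return x * y - w * w  # XY - W^2


p = 7
common = projective_zeros(conic_f, p) & projective_zeros(conic_g, p)
```

The search only sees points with coordinates in $\mathbb{F}_7$, so `common` may have fewer than four members even though the theorem guarantees that the intersection over the algebraic closure is nonempty.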


4. Intersection Multiplicity for a Line with a Curve 


In this section we begin the topic of “intersection multiplicity” for projective plane
curves. The idea is that the number of points in the intersection $V(F) \cap V(G)$ in
Bezout’s Theorem as formulated in Theorem 8.5 should actually equal $mn$, not
merely be bounded above by $mn$, if the field is algebraically closed and the points
are counted according to their “multiplicities,” whatever that might mean.

The prototype is the factorization of a polynomial of degree $n$ in one variable.
The polynomial has at most $n$ roots, and it has exactly $n$ if the field is algebraically
closed and each root is counted according to its multiplicity. In this case, as we
well know, a root $z_0$ of $f(z)$ has multiplicity $k$ if $(z - z_0)^k$ is the largest power of
$z - z_0$ that divides $f(z)$.

Our objective in this section is to develop a notion of intersection multiplicity
for the case of a line and a curve at a point; the case of two curves is less
intuitive and is postponed to the next section. The main result is to be that the
sum of the intersection multiplicities at all points for a line and a projective
plane curve equals the degree of the curve, provided that the underlying field is
algebraically closed and that the line does not divide the curve. The statement
in the previous paragraph about polynomials in one variable will amount to a
special case; for this special case the projective line is $Y$, the projective curve is
of the form $W^{d-1} Y - F(X, W)$, where $F$ is homogeneous of degree $d$ and where
$f(X) = F(X, 1)$, and the divisibility proviso is that $F$ not be the 0 polynomial,
i.e., that $f(z)$ not be identically 0.

Let $K$ be a field, let $L$ be in $K[X, Y, W]_1$, and let $F$ be in $K[X, Y, W]_d$.
The notation for intersection multiplicity will be $I(P, L \cap F)$, where $P = (x_0, y_0, w_0)$
is in $V_K(F) \cap V_K(L)$. To make the definition, we introduce affine
local coordinates. Choose $\Phi$ in $GL(3, K)$ with $\Phi(x_0, y_0, w_0) = (0, 0, 1)$, and
form the corresponding local expressions

\[
\begin{aligned}
f(X, Y) &= F(\Phi^{-1}(X, Y, 1)) = f_1(X, Y) + \cdots + f_d(X, Y), \\
l(X, Y) &= L(\Phi^{-1}(X, Y, 1)).
\end{aligned}
\]

Here $f_j$ is the part of $f$ that is homogeneous of degree $j$. Since $l(0, 0) = 0$,
we see that $l(X, Y) = bX - aY$ for some constants $a$ and $b$ not both 0. Then
$\varphi(t) = \begin{pmatrix} at \\ bt \end{pmatrix}$, for $t \in K$, is a parametrization of the locus in $\mathbb{A}^2_K$ on which
$l(x, y) = 0$. The composition $f(\varphi(t))$ is a polynomial in $t$ with $f(\varphi(0)) = 0$. In
fact,

\[
\begin{aligned}
f(\varphi(t)) &= f_1(at, bt) + f_2(at, bt) + \cdots + f_d(at, bt) \\
&= t f_1(a, b) + t^2 f_2(a, b) + \cdots + t^d f_d(a, b).
\end{aligned}
\]

There are two possibilities. If $f \circ \varphi$ is not the 0 polynomial, then $f(\varphi(t))$
has a zero of some finite order at $t = 0$, and this order is defined to be the
intersection multiplicity, or intersection number, $I(P, L \cap F)$. If $f \circ \varphi$ is the
0 polynomial, then we say that $I(P, L \cap F) = +\infty$. It will be convenient to
define $I(P, L \cap F) = 0$ if $P$ is not in $V_K(L) \cap V_K(F)$. We need to check that
$I(P, L \cap F)$ does not depend on the choice of $\Phi$, but we postpone this verification
until after we consider two examples.


EXAMPLES OF INTERSECTION MULTIPLICITY.


(1) Example 1 in the previous section showed that relative to a suitable $\Phi$ in
$GL(3, K)$, the projective plane curve

\[
F(X, Y, W) = X^2 Y + XYW + W^3
\]

has local expression $f(X, Y)$ about $P = (x_0, y_0, 1)$ given by

\[
\begin{aligned}
f(X, Y) &= (x_0^2 y_0 + x_0 y_0 + 1) + (x_0^2 Y + 2x_0 y_0 X + x_0 Y + y_0 X) \\
&\qquad + (y_0 X^2 + 2x_0 XY + XY) + (X^2 Y) \\
&= f_0 + f_1(X, Y) + f_2(X, Y) + f_3(X, Y).
\end{aligned}
\]

For a line $L$, the intersection multiplicity $I(P, L \cap F)$ is 0 unless $P$ lies in $V_K(F)$,
i.e., unless $f_0 = x_0^2 y_0 + x_0 y_0 + 1 = 0$. Suppose that the line $L$ is given by

\[
L(X, Y, W) = \alpha X + \beta Y + \gamma W,
\]

with local expression

\[
l(X, Y) = L(X + x_0, Y + y_0, 1) = (\alpha x_0 + \beta y_0 + \gamma) + (\alpha X + \beta Y).
\]

Here $\alpha$ and $\beta$ are not both 0. The intersection multiplicity $I(P, L \cap F)$ is 0 unless
$P$ lies also in $V_K(L)$, i.e., unless $\alpha x_0 + \beta y_0 + \gamma = 0$. Thus suppose that $P$ lies
in $V_K(L) \cap V_K(F)$. Then we can parametrize the locus for which $l(x, y) = 0$ by


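\[
\begin{pmatrix} x \\ y \end{pmatrix} = \varphi(t) = \begin{pmatrix} -\beta t \\ \alpha t \end{pmatrix},
\]

and we obtain

\[
\begin{aligned}
f_1(\varphi(t)) &= f_1(-\beta t, \alpha t) = t\,(x_0^2 \alpha - 2x_0 y_0 \beta + x_0 \alpha - y_0 \beta), \\
f_2(\varphi(t)) &= f_2(-\beta t, \alpha t) = t^2 (y_0 \beta^2 - 2x_0 \alpha\beta - \alpha\beta).
\end{aligned}
\]

One point lying in $V_K(F)$ is $P = (x_0, y_0, 1) = (1, -\tfrac12, 1)$, and $P$ lies also
in $V_K(L)$ if $\alpha - \tfrac12\beta + \gamma = 0$, i.e., if $\gamma$ satisfies $\gamma = \tfrac12\beta - \alpha$. Then we
have $f_1(\varphi(t)) = t(2\alpha + \tfrac32\beta)$ and $f_2(\varphi(t)) = t^2(-\tfrac12\beta^2 - 3\alpha\beta)$. Consequently,
$I(P, L \cap F)$ is $\ge 1$ if and only if $\gamma = \tfrac12\beta - \alpha$. In this case, $I(P, L \cap F)$ is $\ge 2$ if
and only if $2\alpha + \tfrac32\beta = 0$, i.e., if $\alpha = -\tfrac34\beta$. When both conditions are satisfied,
we have $f_2(\varphi(t)) = t^2(-\tfrac12\beta^2 - 3\alpha\beta) = t^2(\tfrac74\beta^2)$, and this is not the 0 function
because under these conditions, $\beta = 0$ would imply that $(\alpha, \beta, \gamma) = (0, 0, 0)$;
hence $I(P, L \cap F) = 2$.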

(2) Example 2 in the previous section considered the point
$P = (x_0, y_0, w_0) = (0, 1, 0)$ for the same $F$, namely $F(X, Y, W) = X^2 Y + XYW + W^3$. This $P$
lies in $V_K(F)$. For a suitable $\Phi$, the earlier computations showed that the local
expression for $F$ is

\[
f(X, Y) = (Y^2 + XY) + (X^3).
\]

The most general line $L$ for which $P$ lies in $V_K(L)$ is $\alpha X + \gamma W = 0$, and the
corresponding local expression is

\[
l(X, Y) = L(Y, 1, X) = \alpha Y + \gamma X.
\]

We use the parametrization $\varphi(t) = (-\alpha t, \gamma t)$ for $L$ and obtain

\[
f(\varphi(t)) = t^2(\gamma^2 - \alpha\gamma) + t^3(-\alpha^3).
\]

By inspection we see that $I(P, L \cap F) \ge 2$ for all choices of $\alpha$ and $\gamma$, and that
$I(P, L \cap F) \ge 3$ if and only if $\gamma = 0$ or $\gamma = \alpha$. If $\gamma = 0$ or $\gamma = \alpha$, then $\alpha$
cannot be 0, and thus $I(P, L \cap F) = 3$.
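
The definition of $I(P, L \cap F)$ reduces to finding the lowest power of $t$ that survives in $f(at, bt)$, and this can be computed directly from the homogeneous parts of $f$. The Python sketch below is an illustration under stated assumptions: the local expression is a dictionary from exponent pairs to coefficients, the direction $(a, b)$ of the line in local coordinates is supplied by the caller, and the name `intersection_multiplicity` is hypothetical. The local expression used for Example 1 is the one obtained from $f(X, Y) = F(X + 1, Y - \tfrac12, 1)$.

```python
from fractions import Fraction
from math import inf


def intersection_multiplicity(f, direction):
    """Order of vanishing at t = 0 of f(at, bt), where direction = (a, b).

    f is a local expression {(i, j): c} for X^i Y^j.  A nonzero constant
    term gives 0, and inf is returned when f vanishes identically along
    the line.
    """
    a, b = direction
    by_degree = {}
    for (i, j), c in f.items():
        # The coefficient of t^k in f(at, bt) is f_k(a, b).
        by_degree[i + j] = by_degree.get(i + j, 0) + c * a**i * b**j
    orders = [k for k, value in by_degree.items() if value != 0]
    return min(orders) if orders else inf


# Example 2: f = (Y^2 + XY) + X^3 with direction (-alpha, gamma).
f_ex2 = {(0, 2): 1, (1, 1): 1, (3, 0): 1}
generic = intersection_multiplicity(f_ex2, (-1, 2))       # alpha = 1, gamma = 2
tangent = intersection_multiplicity(f_ex2, (-1, 1))       # gamma = alpha
also_tangent = intersection_multiplicity(f_ex2, (-1, 0))  # gamma = 0

# Example 1 at P = (1, -1/2, 1), with direction (-beta, alpha).
f_ex1 = {
    (1, 0): Fraction(-3, 2), (0, 1): 2,
    (2, 0): Fraction(-1, 2), (1, 1): 3,
    (2, 1): 1,
}
transverse = intersection_multiplicity(f_ex1, (-1, 0))  # beta = 1, alpha = 0
double = intersection_multiplicity(f_ex1, (-4, -3))     # alpha = -(3/4) beta
```

The values agree with the two examples: multiplicity 2 for a general line through $(0, 1, 0)$ and 3 in the two special cases, and multiplicity 1 or 2 at $(1, -\tfrac12, 1)$ according to whether $2\alpha + \tfrac32\beta$ is nonzero or zero.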


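Let us return to the verification that $I(P, L \cap F)$ does not depend on the choice
of $\Phi$. Thus suppose that $\Psi$ is another member of $GL(3, K)$ with
$\Psi(x_0, y_0, w_0) = (0, 0, 1)$. Write

\[
\Psi \circ \Phi^{-1} = \begin{pmatrix} \alpha & \beta & 0 \\ \gamma & \delta & 0 \\ r & s & 1 \end{pmatrix},
\]

form the local expressions

\[
\begin{aligned}
f'(X, Y) &= F(\Psi^{-1}(X, Y, 1)) = f_1'(X, Y) + \cdots + f_d'(X, Y), \\
l'(X, Y) &= L(\Psi^{-1}(X, Y, 1)) = b'X - a'Y,
\end{aligned}
\]

and parametrize the locus in $\mathbb{A}^2_K$ with $l'(x, y) = 0$ by

\[
\begin{pmatrix} x \\ y \end{pmatrix} = \varphi'(t) = \begin{pmatrix} a't \\ b't \end{pmatrix}.
\]

We need a lemma.


Lemma 8.6. In the above notation, $f(X, Y)$ equals

\[
\begin{aligned}
&(rX + sY + 1)^{d-1} f_1'(\alpha X + \beta Y, \gamma X + \delta Y) \\
&\qquad + (rX + sY + 1)^{d-2} f_2'(\alpha X + \beta Y, \gamma X + \delta Y) \\
&\qquad + \cdots + f_d'(\alpha X + \beta Y, \gamma X + \delta Y),
\end{aligned}
\]

and therefore

\[
f_1(X, Y) = f_1'(\alpha X + \beta Y, \gamma X + \delta Y).
\]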


PROOF. For the first conclusion, let us justify the following computation:

\[
\begin{aligned}
f(X, Y) &= (F \circ \Psi^{-1})(\Psi \circ \Phi^{-1}(X, Y, 1)) \\
&= (F \circ \Psi^{-1})(\alpha X + \beta Y, \gamma X + \delta Y, rX + sY + 1) \\
&= (F \circ \Psi^{-1})\Bigl((rX + sY + 1)\Bigl(\tfrac{\alpha X + \beta Y}{rX + sY + 1}, \tfrac{\gamma X + \delta Y}{rX + sY + 1}, 1\Bigr)\Bigr) \\
&= (rX + sY + 1)^d f'\Bigl(\tfrac{\alpha X + \beta Y}{rX + sY + 1}, \tfrac{\gamma X + \delta Y}{rX + sY + 1}\Bigr) \\
&= (rX + sY + 1)^d \Bigl(f_1'\Bigl(\tfrac{\alpha X + \beta Y}{rX + sY + 1}, \tfrac{\gamma X + \delta Y}{rX + sY + 1}\Bigr) + \cdots + f_d'\Bigl(\tfrac{\alpha X + \beta Y}{rX + sY + 1}, \tfrac{\gamma X + \delta Y}{rX + sY + 1}\Bigr)\Bigr) \\
&= (rX + sY + 1)^{d-1} f_1'(\alpha X + \beta Y, \gamma X + \delta Y) \\
&\qquad + (rX + sY + 1)^{d-2} f_2'(\alpha X + \beta Y, \gamma X + \delta Y) \\
&\qquad + \cdots + f_d'(\alpha X + \beta Y, \gamma X + \delta Y).
\end{aligned}
\]

In fact, the first three lines are valid if we make the computation in the field of
fractions $K(X, Y)$, the fourth line uses the homogeneity of $F$ and a substitution
homomorphism that evaluates members of $K[X, Y, W]$ at points of $K(X, Y, W)$,
and the remaining lines use the homogeneity of $f_1', \dots, f_d'$ and a substitution
homomorphism that evaluates their arguments at points of $K(X, Y)$.

This proves the first conclusion. To derive the second conclusion from it, we
expand each of the coefficients on the right side and group terms of the same
degree of homogeneity under $(X, Y) \mapsto (\lambda X, \lambda Y)$. The only term whose degree
of homogeneity is 1 is $f_1'(\alpha X + \beta Y, \gamma X + \delta Y)$ with a coefficient 1 coming from the
expansion of $(rX + sY + 1)^{d-1}$; all other terms have higher degree of homogeneity.
When $f(X, Y)$ on the left side is expanded as a sum of homogeneous polynomials,
the term of degree 1 is $f_1(X, Y)$. The second conclusion follows.


Continuing with the verification that I(P, L ∩ F) does not depend on the
choice of Φ, we apply Lemma 8.6 to L in place of F, and we obtain

\[ l(X, Y) = l'(\alpha X + \beta Y, \gamma X + \delta Y). \]

Since l(X, Y) = bX − aY and l'(X, Y) = b'X − a'Y, this equation shows that

\[ b = b'\alpha - a'\gamma \quad\text{and}\quad -a = b'\beta - a'\delta. \]

Putting Δ = αδ − βγ, we solve for a' and b' and obtain

\[ \alpha a + \beta b = \Delta a' \quad\text{and}\quad \gamma a + \delta b = \Delta b'. \]

When x = at and y = bt, we thus have

\[ \alpha x + \beta y = \alpha a t + \beta b t = t\Delta a' \quad\text{and}\quad \gamma x + \delta y = \gamma a t + \delta b t = t\Delta b'. \]

Substituting these formulas into the first conclusion of Lemma 8.6 and using the
homogeneity of each f'_j gives

\[ f(\varphi(t)) = (art + bst + 1)^{d-1} t\Delta f'_1(a', b') + (art + bst + 1)^{d-2} t^2\Delta^2 f'_2(a', b') + \cdots + t^d\Delta^d f'_d(a', b'). \]

If j is the smallest index for which f'_j(a', b') ≠ 0, then the lowest power of
t remaining on the right side after expansion of the coefficients is t^j, and its
coefficient is Δ^j f'_j(a', b'). Thus we can conclude that the lowest power of t with
nonzero coefficient on the left side is t^j, and its coefficient f_j(a, b) must equal
Δ^j f'_j(a', b'). The equality of the lowest power of t remaining on each side shows
that I(P, L ∩ F) is the same when computed from f as when computed from f',
and we obtain as a bonus the formula f_j(a, b) = Δ^j f'_j(a', b') if t^j is that power.
This completes the verification that I(P, L ∩ F) does not depend on the choice
of Φ.


Now we come back to the circle of ideas around Bezout's Theorem. The first
task is to clarify the meaning of infinite intersection multiplicity.

Proposition 8.7. Over the field K if a projective line L and a projective plane
curve F meet at a point P in P^2_K, then I(P, L ∩ F) = +∞ if and only if L
divides F.


PROOF. If L divides F, then in the above notation the local expression l(X, Y)
divides f(X, Y). Since l(φ(t)) is the 0 polynomial, so is f(φ(t)).

Conversely suppose that f(φ(t)) is the 0 polynomial, so that f_r(a, b) = 0 for
all r with 1 ≤ r ≤ d = deg F. Without loss of generality, suppose b ≠ 0. The
equality

\[ \begin{aligned}
0 = f_r(a, b) &= c_0 a^r + c_1 a^{r-1} b + \cdots + c_r b^r \\
&= b^r \big( c_0 (ab^{-1})^r + c_1 (ab^{-1})^{r-1} + \cdots + c_r \big)
\end{aligned} \]

says that Z − ab^{-1} is a factor of b^r (c_0 Z^r + c_1 Z^{r−1} + ··· + c_r). If we write
b^r (c_0 Z^r + c_1 Z^{r−1} + ··· + c_r) = (Z − ab^{-1}) u(Z)
and take Z = XY^{-1}, then

\[ \begin{aligned}
b^r f_r(X, Y) &= b^r Y^r \big( c_0 (XY^{-1})^r + c_1 (XY^{-1})^{r-1} + \cdots + c_r \big) \\
&= Y^r (XY^{-1} - ab^{-1}) \, u(XY^{-1}) = b^{-1}(bX - aY) \, Y^{r-1} u(XY^{-1}).
\end{aligned} \]

Hence l(X, Y) divides f_r(X, Y) for all r. It follows that l(X, Y) divides f(X, Y)
and then that L divides F.


The full-strength version of Bezout’s Theorem says that two projective plane 
curves F and G of degrees m and n meet in at most mn points even when 
multiplicities are counted, and that the number is equal to mn if K is algebraically 
closed and multiplicities are counted. This theorem will be proved in Section 6. 
For the time being, we shall limit ourselves to the special case of the full-strength 
theorem in which one of the curves is a line. 


Theorem 8.8 (Bezout's Theorem). Let K be an algebraically closed field. If
F is a projective plane curve over K of degree d and if L is a projective line such
that L does not divide F, then Σ_P I(P, L ∩ F) = d.


PROOF. First we show that

\[ \sum_P I(P, L \cap F) < +\infty. \tag{$*$} \]

Since L is assumed not to divide F, Proposition 8.7 shows that I(P, L ∩ F)
is finite at every point of V_K(L) ∩ V_K(F). Thus Σ_P I(P, L ∩ F) is finite if
there are only finitely many points in V_K(L) ∩ V_K(F). Bezout's Theorem in the
form of Theorem 8.5 shows that either V_K(L) ∩ V_K(F) is finite or else L and
F have as a common factor some homogeneous polynomial of positive degree.
Since L has degree 1, L is prime, and thus L and F can have a common factor of
positive degree only if L divides F. We are assuming the contrary, and therefore
V_K(L) ∩ V_K(F) is finite. This proves (∗).

Possibly by applying a projective transformation, we may assume⁷ that the
given line L is the line at infinity W. Then the points P_i with I(P_i, W ∩ F) > 0
are of the form [x_i, y_i, 0]. Taking into account that the algebraically closed field
K is necessarily infinite, we can apply a second projective transformation, one
that translates the Y variable, and assume that no y_i is 0. Then we can write
P_i = [r_i, 1, 0] with r_i in K. Let us see that

\[ H(X) = F(X, 1, 0) \text{ is a nonzero polynomial of degree exactly } d. \tag{$**$} \]

In fact, F(X, Y, W) is homogeneous of degree d, and we have arranged that
[1, 0, 0], which certainly lies in V_K(W), is not in V_K(F). Consequently the X^d
term in F(X, Y, W) has nonzero coefficient, and (∗∗) follows.

Next let us prove that

\[ I([r, 1, 0], W \cap F) = \text{multiplicity of } r \text{ as a root of } H(X) = F(X, 1, 0). \tag{$\dagger$} \]

Then it will follow that Σ_P I(P, W ∩ F) equals the number of roots of
H(X) = F(X, 1, 0), each counted as many times as its multiplicity. In view
of (∗∗) and the fact that K is algebraically closed, we will then have proved that
Σ_P I(P, W ∩ F) = d, as required.

To prove (†), we introduce affine local coordinates about (r, 1, 0), using

\[ \Phi^{-1} = \begin{pmatrix} 1 & 0 & r \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \]

so that Φ(r, 1, 0) = (0, 0, 1). The local versions f of F and l of W
relative to this Φ are

\[ f(X, Y) = F(\Phi^{-1}(X, Y, 1)) = F(X + r, 1, Y), \]
\[ l(X, Y) = W(\Phi^{-1}(X, Y, 1)) = Y. \]

Hence l(X, Y) is of the form bX − aY with a = −1 and b = 0. If we parametrize
l by φ(t) = (at, bt) = (−t, 0), then

\[ f(\varphi(t)) = f(-t, 0) = F(-t + r, 1, 0). \]

The order of vanishing of f(φ(t)) at t = 0, which is I([r, 1, 0], W ∩ F), thus
equals the order of the zero of F(−t + r, 1, 0) at t = 0, which equals the
multiplicity of r as a root of H(X) = F(X, 1, 0). This proves (†), and the
theorem follows.

⁷ If P and P' are distinct points in P^2_K, then there exists a projective transformation carrying P
to [1, 0, 0] and P' to [0, 1, 0]. This transformation carries the unique line through P and P' to the
line at infinity.
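The reduction in this proof turns the count of intersections with the line at infinity into a count of roots of H(X) = F(X, 1, 0). As a hedged illustration, the sketch below computes root multiplicities by repeated synthetic division. The helper names are hypothetical, coefficients are listed from highest degree to lowest, and the second sample polynomial X^3 − 3X + 2 = (X − 1)^2 (X + 2) is an assumed example of a restriction H rather than one taken from the text.

```python
def divide_by_linear(coeffs, r):
    """Synthetic division of a polynomial (highest degree first) by X - r.

    Returns (quotient, remainder).
    """
    acc = []
    carry = 0
    for c in coeffs:
        carry = carry * r + c
        acc.append(carry)
    return acc[:-1], acc[-1]


def root_multiplicity(coeffs, r):
    """Number of times X - r divides the given nonzero polynomial."""
    mult = 0
    quotient, remainder = divide_by_linear(coeffs, r)
    while remainder == 0 and quotient:
        mult += 1
        coeffs = quotient
        quotient, remainder = divide_by_linear(coeffs, r)
    return mult


# F = Y^2 W - X^3 has H(X) = F(X, 1, 0) = -X^3, of degree d = 3.
cusp_H = [-1, 0, 0, 0]

# Assumed second example: H(X) = X^3 - 3X + 2 = (X - 1)^2 (X + 2).
sample_H = [1, 0, -3, 2]
```

For the cuspidal cubic, `root_multiplicity(cusp_H, 0)` is 3, so the single point [0, 1, 0] accounts for all of d = 3. For the second polynomial the multiplicities at 1 and −2 are 2 and 1, again summing to the degree.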


5. Intersection Multiplicity for Two Curves 


In this section we continue the topic of “intersection multiplicity” begun in Sec- 
tion 4. That section dealt with intersection multiplicity for the special case of a 
projective line and a projective plane curve, and the present section deals with 
the general case of two projective plane curves. The next section will use the 
general notion to address Bezout’s Theorem in full generality. In this section and 
the next we shall make occasional use of material from Chapter VII, especially 
Lemma 7.21 and the results in Section VII.1. 

It is worth reviewing qualitatively what happened in Section 4. What we 
did was refer the given line and curve to affine space, parametrize the line in a 
natural way, and substitute the parametrization into the formula for the curve to 
obtain a scalar-valued function of one variable. The order of vanishing of the 
resulting scalar-valued function of one variable was defined to be the intersection 
multiplicity. The classical approach⁸ for handling two curves proceeds by trying
to generalize this construction, in effect parametrizing one curve and substituting 
into the other. The fact that there need be no natural parametrization of either 
of the curves leads to a number of complications, and ultimately the argument 
involves a complicated ring of power series. 

We shall follow a somewhat more modern approach⁹ based on localizations.¹⁰
The definition is not particularly intuitive, and it is necessary to study some 
examples to see its virtues. We give the definition, show that the definition is 
consistent with the definition in the special case of Section 4, check that the 
definition makes sense in general, state some properties that are useful in making 
computations, work out an example, and then verify the properties. Thus let F 
and G be homogeneous polynomials in (X, Y, W) of respective degrees m and n, 
and let P = [x_0, y_0, w_0] be a point of the projective plane P^2_K over a field K. We
refer matters back to affine space in the usual way by letting Φ be any member
of GL(3, K) such that Φ(x_0, y_0, w_0) = (0, 0, 1). The local expressions from Φ
about (0, 0) corresponding to F and G are the polynomials f and g with

\[ f(X, Y) = F(\Phi^{-1}(X, Y, 1)), \]
\[ g(X, Y) = G(\Phi^{-1}(X, Y, 1)). \]

⁸ An account appears in Walker, Chapter IV.

⁹ See Fulton, Chapter 3, for the present section and Fulton, Chapter 5, for the next section.

¹⁰ For a still more modern and more general approach, see Serre's Algèbre Locale. Serre's opening
sentence summarizes matters by saying, "Intersection multiplicities in algebraic geometry are equal
to certain 'Euler–Poincaré characteristics' formed by means of the Tor functors of Cartan–Eilenberg."


These polynomials break into homogeneous parts as

\[ f(X, Y) = f_0 + f_1(X, Y) + \cdots + f_m(X, Y), \]
\[ g(X, Y) = g_0 + g_1(X, Y) + \cdots + g_n(X, Y), \]

with f_j and g_j homogeneous of degree j in the pair (X, Y). We assume that P
lies on the locus V_K(F) ∩ V_K(G) of common zeros of F and G, and the condition
for this to happen is that f_0 = g_0 = 0. The order of vanishing m_P(F) of F at
P is the first j for which f_j is not the zero polynomial; we saw as a consequence
of Lemma 8.6 that this quantity is well defined independently of the choice of Φ.

The intersection multiplicity I(P, F ∩ G) of F and G at P can be defined in
either of two equivalent ways. The equivalence of the two definitions will be used
repeatedly in the discussion and follows from the fact that localization commutes
with passage to the quotient by an ideal, a fact that was proved as Lemma 7.21.
One definition is

\[ I(P, F \cap G) = \dim_K \big( (K[X, Y]/(f, g))_{(0,0)} \big), \]

where (K[X, Y]/(f, g))_{(0,0)} is the localization at (0, 0) of the K algebra
K[X, Y]/(f, g). That is, we form the quotient ring of K[X, Y] by the ideal
generated by f and g, localize with respect to the maximal ideal of all
members of the quotient vanishing at (0, 0), and compute the dimension of this
localization over K. The other definition is

\[ I(P, F \cap G) = \dim_K \big( S^{-1}K[X, Y] / S^{-1}(f, g) \big), \]

where S is the multiplicative system in K[X, Y] consisting of the complement of
the maximal ideal (X, Y), i.e., consisting of all polynomials that are nonvanishing
at (0, 0). In either case all elements of the ring being localized have interpretations
as functions, and the multiplicative system consists of all the functions that are
nonzero at a certain point. Nevertheless, the matter is a little subtle because
some members of the multiplicative system in the first case may be zero divisors.
Here is a lower-dimensional example of that phenomenon that can also serve as
a guiding example for Theorem 8.12 below.

EXAMPLE OF GEOMETRIC LOCALIZATION. R = (K[X]/(X^2 (X − 1)^2))_{(0)},
with the subscript indicating localization at 0. Before passage to the localization,
the quotient Q = K[X]/(X^2 (X − 1)^2) has dimension 4, with a basis consisting
of the cosets of 1, X, X^2, X^3. The multiplicative system S for localization at 0
consists of all members of the quotient that are nonzero at 0. The localization as a
set consists of equivalence classes of pairs (r, s) with r in Q and s in S, two pairs
(r, s) and (r', s') being equivalent if t(rs' − r's) = 0 for some t in S. Localization
is a ring homomorphism, and we therefore consider the pairs (r, s) in the class of
the additive identity. These have t(r·1 − 0·s) = 0 for some t. Then t and r have
representatives t(X) and r(X) in K[X] such that t(X)r(X) = p(X) X^2 (X − 1)^2
for some p(X). Furthermore, t(0) ≠ 0. Then X^2 must divide r(X), and this
condition is also sufficient for the choice t(X) = (X − 1)^2. Thus the members
X^2 q(X) of K[X] give 0 in the localization, and the localization is isomorphic to
the 2-dimensional algebra K[X]/(X^2).


Proposition 8.9 below will show that I(P, F ∩ G) is independent of the func-
tion Φ used to introduce affine local coordinates. Assuming this independence,
we begin with an example that shows that the definition is consistent with the
definition in Section 4.


EXAMPLE 1 OF INTERSECTION MULTIPLICITY. Case of a line L and a curve
F homogeneous of degree d. Assuming that P lies in V_K(L) ∩ V_K(F), we
introduce affine local coordinates by means of a member Φ of GL(3, K) that
carries a representative of P to (0, 0, 1), and we let l(X, Y) and f(X, Y) be
the corresponding local expressions for L and F. Let f = f_1 + ··· + f_d
be the decomposition of f into its homogeneous parts. Since the intersection
multiplicity is being assumed to be independent of the choice of Φ and since for
any second point on a line through (0, 0, 1), there exists a Φ that fixes (0, 0, 1)
and carries that second point to (1, 0, 1), we may assume that l(X, Y) = Y. We
introduce the parametrization (x, y) = φ(t) = (t, 0) for the line l(X, Y) and
substitute into f(X, Y), obtaining f(φ(t)) = f_1(t, 0) + ··· + f_d(t, 0). In the
definition of Section 4, the intersection multiplicity is the least r such that f_r(t, 0)
is not identically 0, or else it is +∞ if f(φ(t)) is identically 0. With the new
definition we observe from the definition of r that f is of the form

\[ f(X, Y) = (c_r X^r + \cdots + c_d X^d) + Y g(X, Y) = c_r X^r (1 + X h(X)) + Y g(X, Y) \]

with c_r ≠ 0, g(X, Y) ∈ K[X, Y], and h(X) ∈ K[X]. The ideal in K[X, Y]
generated by Y and f is the same as the ideal generated by Y and X^r (1 + X h(X)).
Hence

\[ K[X, Y]/(Y, f) = K[X, Y]/(Y, X^r(1 + Xh)) \cong K[X]/(X^r(1 + Xh)). \]

The polynomial 1 + X h(X) takes a nonzero value at 0 and hence is a member of
the multiplicative system that we use to form the localization. Thus

\[ (K[X, Y]/(Y, f))_{(0,0)} \cong (K[X]/(X^r(1 + Xh)))_{(0)} \cong (K[X]/(X^r))_{(0)}. \]

The dimension of the right side is r, and thus the new definition of intersection
multiplicity matches the old one.


Proposition 8.9. The intersection multiplicity of two projective plane curves
F and G at P is well defined independently of the member Φ of GL(3, K) that moves a
representative of P to (0, 0, 1).

PROOF. It is enough to take P = [0, 0, 1] and to compare the effect of passing
to affine local coordinates determined by the identity with the effect of passing
to the coordinates determined by a general element Φ of GL(3, K) of the form

\[ \Phi = \begin{pmatrix} \alpha & \beta & 0 \\ \gamma & \delta & 0 \\ r & s & 1 \end{pmatrix}. \]

Let deg F = m and deg G = n. If f(X, Y) = F(X, Y, 1) and
f̃(X, Y) = F(Φ^{-1}(X, Y, 1)), then the computation in the proof of Lemma 8.6
shows that

\[ f(X, Y) = (1 + rX + sY)^m \, \tilde f\Big( \tfrac{\alpha X + \beta Y}{1 + rX + sY}, \tfrac{\gamma X + \delta Y}{1 + rX + sY} \Big). \tag{$*$} \]

Similarly if g(X, Y) = G(X, Y, 1) and g̃(X, Y) = G(Φ^{-1}(X, Y, 1)), then

\[ g(X, Y) = (1 + rX + sY)^n \, \tilde g\Big( \tfrac{\alpha X + \beta Y}{1 + rX + sY}, \tfrac{\gamma X + \delta Y}{1 + rX + sY} \Big). \]

Let

\[ X' = \frac{\alpha X + \beta Y}{1 + rX + sY}, \qquad Y' = \frac{\gamma X + \delta Y}{1 + rX + sY}, \qquad\text{and}\qquad \Phi^{-1} = \begin{pmatrix} \alpha' & \beta' & 0 \\ \gamma' & \delta' & 0 \\ r' & s' & 1 \end{pmatrix}. \]

It is purely a formal matter that the mapping T defined by (Th)(X, Y) = h(X', Y')
is a field isomorphism of K(X, Y) onto K(X', Y'). It sends K[X, Y] onto
K[X', Y'] and sends (K[X, Y])_{(0,0)} onto (K[X', Y'])_{(0,0)}. Referring to the formu-
las for X' and Y', we see that the image of K[X, Y] is contained in the localization
(K[X, Y])_{(0,0)}; by the universal mapping property of localizations, the image of
(K[X, Y])_{(0,0)} is contained in (K[X, Y])_{(0,0)}. Comparing these two conclusions,
we see that (K[X', Y'])_{(0,0)} ⊆ (K[X, Y])_{(0,0)}.
Meanwhile, we can solve the equations defining X' and Y' for X and Y. If we
compare the results with the formula for Φ^{-1}, we find that

\[ X = \frac{\alpha' X' + \beta' Y'}{1 + r'X' + s'Y'} \qquad\text{and}\qquad Y = \frac{\gamma' X' + \delta' Y'}{1 + r'X' + s'Y'}. \]

Thus the situation is symmetric, and we have (K[X, Y])_{(0,0)} ⊆ (K[X', Y'])_{(0,0)}.
Consequently the mapping

\[ (Th)(X, Y) = h\Big( \tfrac{\alpha X + \beta Y}{1 + rX + sY}, \tfrac{\gamma X + \delta Y}{1 + rX + sY} \Big) \]

is an algebra automorphism of (K[X, Y])_{(0,0)}.

To prove the proposition, recall that localization commutes with passage to the
quotient by an ideal. In view of (∗), it is therefore enough to show that

\[ \dim_K \big( (K[X, Y])_{(0,0)} / (\tilde f, \tilde g) \big) = \dim_K \big( (K[X, Y])_{(0,0)} / ((1 + rX + sY)^m T\tilde f, \, (1 + rX + sY)^n T\tilde g) \big). \tag{$**$} \]

The factor (1 + rX + sY) is a unit in (K[X, Y])_{(0,0)}, and we can simplify the
quotient algebra on the right side of (∗∗) to

\[ (K[X, Y])_{(0,0)} / (T\tilde f, T\tilde g). \]

In turn, this algebra is K isomorphic to (K[X, Y])_{(0,0)}/(f̃, g̃) because T is an
automorphism of (K[X, Y])_{(0,0)}. The dimensional equality in (∗∗) follows.


Let us extend the definition of intersection multiplicity to include the case
that the point of interest does not lie in the locus of common zeros. We define
I(P, F ∩ G) = 0 if P is not in V_K(F) ∩ V_K(G). Assume now that K is
algebraically closed. Below we compute a fairly typical example of intersection
multiplicity. To do so, we shall make use of certain properties of I(P, F ∩ G)
that we list in Theorem 8.10 below. In fact, there is an algorithm for computing
I(P, F ∩ G) using only these properties,¹¹ but we shall not give it.

Before stating the properties, we need to make some definitions. Recall from
earlier in the section that the order of vanishing m_P(G) of G at P is computed using
a suitable Φ in GL(3, K) to refer G to affine local coordinates about P, defining
g(X, Y) = G(Φ^{-1}(X, Y, 1)), expanding g(X, Y) as a sum of homogeneous terms
g(X, Y) = g_0 + g_1(X, Y) + ··· + g_n(X, Y), and defining m_P(G) to be the least j
such that g_j is not the 0 polynomial. The homogeneous polynomial g_j(X, Y) is X^j
times a polynomial in the one variable YX^{-1}, and the fact that K is algebraically
closed implies that g_j has a factorization of the form

\[ g_j(X, Y) = c \prod_i (\alpha_i X + \beta_i Y)^{m_i} \]

with c in K. Here j = Σ_i m_i, and the pairs (α_i, β_i) correspond to distinct
members of P^1_K that are uniquely determined up to indexing if c ≠ 0. Let
l_i(X, Y) = α_i X + β_i Y, and let L_i be the corresponding projective line. We
refer to all the lines L_i as the tangent lines to G at P, and we say that m_i is the
multiplicity of L_i. The geometry of the situation is indicated in Problem 12 at
the end of the chapter.

¹¹ Fulton, p. 76.


Theorem 8.10. Let K be an algebraically closed field, let P be in P^2_K, and let
F and G be projective plane curves over K. Then the intersection multiplicity
I(P, F ∩ G) has the following properties:
(a) I(P, F ∩ G) = I(P, G ∩ F),
(b) I(P, F ∩ G) = I(P, F ∩ (G + HF)) for any projective plane curve H
with deg HF = deg G such that G + HF ≠ 0,
(c) I(P, F ∩ G) > 0 if and only if P lies in V_K(F) ∩ V_K(G),
(d) I(P, F ∩ G) ≤ I(P, AF ∩ BG) for any projective plane curves A and
B, with equality if A and B are nonvanishing at P,
(e) I(P, F ∩ G) is finite if and only if F and G have no common factor of
degree ≥ 1 having P on its zero locus,
(f) I(P, F ∩ GH) = I(P, F ∩ G) + I(P, F ∩ H) and consequently if F =
∏_i F_i^{r_i} and G = ∏_j G_j^{s_j}, then I(P, F ∩ G) = Σ_{i,j} r_i s_j I(P, F_i ∩ G_j),
(g) I(P, F ∩ G) ≥ m_P(F) m_P(G), with equality if F and G have no tangent
lines in common at P.


REMARKS. Properties (a) and (b) are evident. Properties (c) and (d) are
conversational and will be proved in these remarks. Properties (e), (f), and (g)
require proofs, and we give those proofs after computing an example. For (c), if
P lies in V_K(F) ∩ V_K(G), then the local expressions f(X, Y) and g(X, Y) vanish
at 0, and so does every member of the ideal (f, g); therefore (f, g) is a proper ideal
in (K[X, Y])_{(0,0)}, and the dimension of the quotient is positive. Conversely if P is
not in V_K(F), say, then f(X, Y) lies in the multiplicative system S of nonvanish-
ing polynomials at (0, 0), and S^{-1}(f, g) = (1); hence S^{-1}K[X, Y]/S^{-1}(f, g) =
0, and I(P, F ∩ G) = 0. For (d), S^{-1}(af, bg) ⊆ S^{-1}(f, g) with equality if a and
b are nonvanishing at (0, 0), and hence S^{-1}K[X, Y]/S^{-1}(f, g) is a homomorphic
image of S^{-1}K[X, Y]/S^{-1}(af, bg) and is a one-one homomorphic image if a and
b are nonvanishing at (0, 0).


EXAMPLE 2 OF INTERSECTION MULTIPLICITY. Let K = C, and let the two
projective curves be the homogeneous versions of Y^2 = X^3 and Y^2 = X^5. In
other words, let

\[ F(X, Y, W) = Y^2 W - X^3 \qquad\text{and}\qquad G(X, Y, W) = Y^2 W^3 - X^5. \]

We compute I(P, F ∩ G) for all points P in V_K(F) ∩ V_K(G). In the affine
plane the intersections (x, y) may be found by substituting the one equation into
the other (or, with more effort in this case, by using the resultant). We obtain
x^5 − x^3 = 0. This gives x^3(x^2 − 1) = 0. The factor x^2 − 1 has two distinct
roots, and each gives two distinct y's. Thus we obtain the five affine solutions
(+1, ±1), (−1, ±i), (0, 0). The fact that the first four occurred routinely with
multiplicity 1 translates into intersection multiplicity 1 for each: In fact, (b) shows
that I(P, F ∩ G) = I(P, F ∩ (W^2 F − G)), and W^2 F − G restricts at (X, Y, 1)
to X^5 − X^3 = X^3(X^2 − 1). At each of the points (+1, ±1), X^5 − X^3 when
viewed as equal to 0 has a vertical tangent X − 1 of multiplicity 1, while Y^2 − X^3
has a tangent that is not vertical. A similar argument applies at each of the points
(−1, ±i). By (g), the intersection multiplicity is 1 at each of the four points
(+1, ±1) and (−1, ±i).

Next let us consider (0, 0). The order of X^5 − X^3 is 3, and the homogeneous
term of degree 3, namely −X^3, factors as the cube of a linear factor that gives
the vertical line X. Meanwhile, Y^2 − X^3 has order 2 at (0, 0), and Y^2 factors
as the square of a linear factor that gives the horizontal line Y. The two curves
have no tangents in common. Hence equality holds in (g), and the intersection
multiplicity is 6 at (0, 0).

Finally let us check points (x, y, w) on the line at infinity, i.e., those with
w = 0. Putting w = 0 in the formula F = G = 0 shows that x = 0. Thus
the only point of V_K(F) ∩ V_K(G) on the line at infinity is P = [x_0, y_0, w_0] =
[0, 1, 0]. The local versions of F and G may be given in the variables X and
W by restricting (X, Y, W) to (X, 1, W) and considering the polynomials about
(x, w) = (0, 0). As above, (b) gives I(P, F ∩ G) = I(P, F ∩ (W^2 F − G)), but
F = Y^2 W − X^3 restricts to W − X^3 and W^2 F − G = −W^2 X^3 + X^5 remains
unchanged upon restriction. The respective lowest-order terms, in factored form,
are W and X^3 (X + W)(X − W). None of the factors of the first polynomial
matches a factor of the second polynomial, and (g) says that the intersection
multiplicity is 1 · 5 = 5.

The upshot is that we get multiplicity 6 from (0, 0), multiplicity 1 apiece from
four other points in the affine plane, and multiplicity 5 from P = [0, 1, 0]. The
total is 15, the product of the degrees of the given curves, as it must be if we are
to have any chance of obtaining the desired generalization of Bezout's Theorem.
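The tally in this example can be reproduced numerically. The sketch below is an illustration under stated assumptions, not an implementation of the algorithm mentioned before Theorem 8.10: it computes only the orders of vanishing m_P of the two local polynomials at each intersection point and multiplies them, relying on the observation made in the example that the curves share no tangent line at any of these points, so that equality holds in property (g). The function names `shift` and `order` and the tolerance `eps` are hypothetical choices.

```python
from math import comb


def shift(poly, a, b):
    """Return poly(X + a, Y + b), with poly given as {(i, j): coeff}."""
    out = {}
    for (i, j), c in poly.items():
        for p in range(i + 1):
            for q in range(j + 1):
                term = c * comb(i, p) * a ** (i - p) * comb(j, q) * b ** (j - q)
                out[(p, q)] = out.get((p, q), 0) + term
    return out


def order(poly, a, b, eps=1e-9):
    """Order of vanishing of poly at (a, b): least total degree that
    survives after translating (a, b) to the origin."""
    shifted = shift(poly, a, b)
    return min(i + j for (i, j), c in shifted.items() if abs(c) > eps)


# Affine chart W = 1: local versions of F and of W^2 F - G.
f_affine = {(0, 2): 1, (3, 0): -1}   # Y^2 - X^3
k_affine = {(5, 0): 1, (3, 0): -1}   # X^5 - X^3

# Chart Y = 1 with coordinates (X, W), about the point [0, 1, 0].
f_inf = {(0, 1): 1, (3, 0): -1}      # W - X^3
k_inf = {(5, 0): 1, (3, 2): -1}      # X^5 - W^2 X^3

affine_points = [(0, 0), (1, 1), (1, -1), (-1, 1j), (-1, -1j)]
products = [order(f_affine, a, b) * order(k_affine, a, b)
            for a, b in affine_points]
products.append(order(f_inf, 0, 0) * order(k_inf, 0, 0))
total = sum(products)
```

With these inputs `products` is `[6, 1, 1, 1, 1, 5]` and `total` is 15, the product of the degrees 3 and 5.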


To get at Theorem 8.10, we make use of a structure theorem about ideals I in
K[X_1, ..., X_n] for which V(I) is a finite set. To prove the structure theorem,
which appears as Theorem 8.12 below, we first prove a lemma about the radical
√I of an ideal I, a notion defined in Section VII.1.

Lemma 8.11. If R is a commutative Noetherian ring and I is an ideal in R,
then (√I)^m ⊆ I for some integer m ≥ 1.

PROOF. Since R is Noetherian, the ideal √I is finitely generated. Let
{a_1, ..., a_n} be a set of generators for it. By definition of radical, choose integers
k_1, ..., k_n such that a_j^{k_j} is in I for 1 ≤ j ≤ n, and put m = Σ_{j=1}^n k_j. The most
general element of √I is of the form Σ_{j=1}^n r_j a_j with all r_j in R. The m-th power
of this element is a sum of terms of the form r a_1^{l_1} ··· a_n^{l_n} with Σ_{j=1}^n l_j = m. In
view of the definition of m, we must have l_j ≥ k_j for some j. Then the factor a_j^{l_j}
is in I, and hence the whole term r a_1^{l_1} ··· a_n^{l_n} is in I.


Theorem 8.12. Let K be an algebraically closed field, and let I be an ideal
in the polynomial ring K[X_1, ..., X_n] whose locus of common zeros in K^n is
a finite set {P_1, ..., P_k}. Then K[X_1, ..., X_n]/I is isomorphic as a ring to the
product of its localizations at the points P_j:

\[ K[X_1, \dots, X_n]/I \cong \prod_{j=1}^{k} (K[X_1, \dots, X_n]/I)_{(P_j)}. \]

Consequently

\[ \dim_K (K[X_1, \dots, X_n]/I) = \sum_{j=1}^{k} \dim_K (K[X_1, \dots, X_n]/I)_{(P_j)}. \]

REMARKS. The one-variable case is a guide: The ideal I is principal, and we
can write K[X]/I as K[X]/(∏_{j=1}^k (X − c_j)^{m_j}). The points P_j of the theorem are
the members c_j of K, and the same argument as for the first example of the section
shows that (K[X]/(∏_j (X − c_j)^{m_j}))_{(c_i)} ≅ K[X]/((X − c_i)^{m_i}). The isomorphism of
the theorem therefore reduces to an instance of the Chinese Remainder Theorem.


PROOF. Let φ_j : K[X_1, ..., X_n]/I → (K[X_1, ..., X_n]/I)_{(P_j)} be the canoni-
cal homomorphism, and let φ = (φ_1, ..., φ_k). The mapping φ is a ring homomor-
phism into ∏_{j=1}^k (K[X_1, ..., X_n]/I)_{(P_j)}, and we shall prove that φ is one-one
onto. Doing so requires some preparation.

Let I_j be the maximal ideal of all polynomials vanishing at P_j. The Null-
stellensatz (Theorem 7.1) shows that √I consists of all f ∈ K[X_1, ..., X_n] such
that f vanishes at each P_j, i.e., that √I = ∩_{j=1}^k I_j. Lemma 8.11 shows that
(√I)^m ⊆ I for some m, and thus (∩_{j=1}^k I_j)^m ⊆ I. For i ≠ j, I_i^m + I_j^m is an
ideal whose locus of common zeros is empty, and the Nullstellensatz shows that
I_i^m + I_j^m = K[X_1, ..., X_n]. The Chinese Remainder Theorem (Theorem 8.27
of Basic Algebra) therefore applies and shows that the intersection ∩_{j=1}^k I_j^m and
the product ∏_{j=1}^k I_j^m coincide. Similarly I_i + I_j = K[X_1, ..., X_n], and hence
∩_{j=1}^k I_j = ∏_{j=1}^k I_j. Putting these facts together, we conclude that

\[ \bigcap_{j=1}^{k} I_j^m = \prod_{j=1}^{k} I_j^m = \Big( \prod_{j=1}^{k} I_j \Big)^m = \Big( \bigcap_{j=1}^{k} I_j \Big)^m \subseteq I. \tag{$*$} \]


Let us now denote members of K[X_1, ..., X_n] by uppercase letters and their
cosets modulo I by the corresponding lowercase letters. Let us observe for
1 ≤ i ≤ k that there exists F_i ∈ K[X_1, ..., X_n] with F_i(P_j) = δ_{ij}. In fact, we
start from the special case that if P ≠ Q, then there exists F with F(P) = 1
and F(Q) = 0. For the special case, P and Q differ in some coordinate; say that
x_i(P) ≠ x_i(Q). Then the polynomial

\[ F(X_1, \dots, X_n) = (X_i - x_i(Q))(x_i(P) - x_i(Q))^{-1} \]

has the required properties. To construct F_1 with F_1(P_j) = δ_{1j}, choose G_j
with G_j(P_1) = 1 and G_j(P_j) = 0. Then F_1 = ∏_{j=2}^k G_j has F_1(P_1) = 1 and
F_1(P_j) = 0 for j ≠ 1. The polynomials F_2, ..., F_k are constructed similarly.
With m as in the second paragraph of the proof, fix j and define E_i =
1 − (1 − F_i^m)^m. This is divisible by F_i^m and hence lies in I_j^m if i ≠ j. In
addition, 1 − F_j^m lies in I_j, and hence 1 − E_j = (1 − F_j^m)^m is in I_j^m. Therefore
1 − Σ_i E_i = (1 − E_j) − Σ_{i≠j} E_i lies in I_j^m. Since the left side is independent
of j, 1 − Σ_i E_i lies in ∩_{j=1}^k I_j^m, and we conclude from (∗) that

\[ 1 - \sum_{i=1}^{k} E_i \ \text{lies in } I. \tag{$**$} \]


We just saw that E_i lies in ∩_{j≠i} I_j^m. Hence if i ≠ j, then E_i E_j lies in ∩_{l=1}^k I_l^m ⊆
I. Passing to cosets modulo I, we find from this fact and (∗∗) that

\[ e_i e_j = 0 \ \text{for } i \ne j, \quad\text{and that}\quad \sum_{i=1}^{k} e_i = 1. \tag{$\dagger$} \]

Multiplying the second equation by e_i and substituting from the first equation,
we obtain

\[ e_i^2 = e_i \quad\text{for all } i. \tag{$\dagger\dagger$} \]

Using (†) and (††), let us prove for each i that

\[ \text{to each } G \in K[X_1, \dots, X_n] \text{ with } G(P_i) \ne 0 \text{ corresponds a polynomial } H \text{ with } hg = e_i. \tag{$\ddagger$} \]

In fact, we may assume that G(P_i) = 1. Let Q be the member of I_i given by
Q = 1 − G. The element Q^m E_i is in I_i^m because Q is in I_i, and it is in I_j^m
for j ≠ i because E_i is in I_j^m for j ≠ i. Thus Q^m E_i is in ∩_{j=1}^k I_j^m ⊆ I, and
q^m e_i = 0. Consequently

\[ g(e_i + q e_i + \cdots + q^{m-1} e_i) = (1 - q) e_i (1 + q + \cdots + q^{m-1}) = e_i (1 - q^m) = e_i, \]

and H = E_i (1 + Q + ··· + Q^{m−1}) is a polynomial as in (‡).

Now we can prove that φ is one-one. If f is a member of K[X_1, ..., X_n]/I
such that φ(f) = 0, then φ_i(f) = 0 for all i. This means that there exists a
member g_i of the multiplicative system for localization at P_i such that g_i f = 0.
Any corresponding polynomial G_i has G_i(P_i) ≠ 0. By (‡), there exists H_i with
h_i g_i = e_i. Then (†) gives f = Σ_i e_i f = Σ_i h_i g_i f = 0. Thus φ is
one-one.

For the proof that φ is onto, we recall that the multiplicative system used to
obtain (K[X_1, ..., X_n]/I)_{(P_j)} consists of the elements of K[X_1, ..., X_n]/I that
are nonzero at P_j, and φ_j carries these to units in (K[X_1, ..., X_n]/I)_{(P_j)}. Since
E_j(P_j) = 1, φ_j(e_j) is a unit. For i ≠ j, we have φ_j(e_j)φ_j(e_i) = φ_j(e_j e_i) = 0,
and therefore φ_j(e_i) = 0. Consequently

\[ \varphi_j(e_j) = \sum_{i=1}^{k} \varphi_j(e_i) = \varphi_j\Big( \sum_{i=1}^{k} e_i \Big) = \varphi_j(1) = 1, \]

and φ_j(e_j) is the identity of (K[X_1, ..., X_n]/I)_{(P_j)}. The localization at
P_j consists of the equivalence classes of all pairs (r_j, s_j) with r_j and s_j in
K[X_1, ..., X_n]/I and s_j in the multiplicative system for index j. Thus let
such pairs (r_j, s_j) be given for 1 ≤ j ≤ k. We are to produce an element a
of K[X_1, ..., X_n]/I such that φ_j(a) = φ_j(r_j)(φ_j(s_j))^{-1} for all j. Use of (‡)
produces h_j with h_j s_j = e_j for all j, and this element has the property that
φ_j(h_j)φ_j(s_j) = φ_j(e_j) = 1, hence that φ_j(h_j) = φ_j(s_j)^{-1}. Consequently the
element a = Σ_i r_i h_i e_i has the property that

\[ \varphi_j(a) = \varphi_j\Big( \sum_i r_i h_i e_i \Big) = \sum_i \varphi_j(r_i)\varphi_j(h_i)\varphi_j(e_i) = \varphi_j(r_j)(\varphi_j(s_j))^{-1} \]

and exhibits φ as onto.
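The idempotents e_i constructed in this proof can be made concrete in the one-variable guiding case I = (X^2 (X − 1)^2), with P_1 = 0 and P_2 = 1. The sketch below is an illustration under assumptions chosen for this case only: it takes F_1 = 1 − X, F_2 = X, and m = 2 (which works because (X(X − 1))^2 lies in I), uses integer coefficient lists with the lowest degree first, and reduces modulo the monic polynomial X^4 − 2X^3 + X^2. All function names are hypothetical.

```python
# X^2 (X - 1)^2 = X^4 - 2X^3 + X^2, coefficients listed lowest degree first.
MODULUS = [0, 0, 1, -2, 1]


def mul(p, q):
    """Product of two coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def sub(p, q):
    """Difference p - q of two coefficient lists."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a - b for a, b in zip(p, q)]


def reduce(p, modulus=MODULUS):
    """Remainder of p on division by the monic polynomial modulus."""
    p = list(p)
    d = len(modulus) - 1
    while len(p) > d:
        lead = p.pop()              # coefficient of the current top degree
        offset = len(p) - d         # top degree minus deg(modulus)
        for k in range(d):
            p[offset + k] -= lead * modulus[k]
    return p + [0] * (d - len(p))


def power(p, k):
    """k-th power of p, reduced modulo MODULUS."""
    out = [1]
    for _ in range(k):
        out = reduce(mul(out, p))
    return out


def idempotent(F, m):
    """Coset of E = 1 - (1 - F^m)^m modulo MODULUS."""
    inner = sub([1], power(F, m))   # 1 - F^m
    return reduce(sub([1], power(inner, m)))


e1 = idempotent([1, -1], 2)         # from F_1 = 1 - X, so F_1(0) = 1, F_1(1) = 0
e2 = idempotent([0, 1], 2)          # from F_2 = X,     so F_2(0) = 0, F_2(1) = 1
```

Here `e1` is 1 − 3X^2 + 2X^3 = (X − 1)^2 (2X + 1) and `e2` is 3X^2 − 2X^3 = X^2 (3 − 2X). Their sum is 1, their product reduces to 0, and each is idempotent modulo I, in line with the relations (†) and (††) of the proof.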


Corollary 8.13. Let K be an algebraically closed field, and let I be an ideal
in the polynomial ring K[X_1, ..., X_n] whose locus of common zeros in K^n is a
finite set {P_1, ..., P_k}. Then K[X_1, ..., X_n]/I is finite-dimensional, and so is
the localization (K[X_1, ..., X_n]/I)_{(P_j)} for each j.

PROOF. This is a corollary partly of the statement of Theorem 8.12 and partly
of the proof. Let m be as in the proof. If I_0 is the maximal ideal (X_1, ..., X_n)
of K[X_1, ..., X_n], then I_0^m is the ideal generated by all monomials of degree m,
and K[X_1, ..., X_n]/I_0^m is finite-dimensional. Consequently the maximal ideal
I_j = (X_1 − x_1(P_j), ..., X_n − x_n(P_j)) has the property that K[X_1, ..., X_n]/I_j^m
is finite-dimensional. Since I_i^m + I_j^m = K[X_1, ..., X_n] for i ≠ j, the Chinese
Remainder Theorem shows that

\[ K[X_1, \dots, X_n] \Big/ \prod_{j=1}^{k} I_j^m \ \cong\ \prod_{j=1}^{k} K[X_1, \dots, X_n]/I_j^m, \]

and the left side is therefore finite-dimensional. By (∗) in the proof of Theorem
8.12, ∏_{j=1}^k I_j^m ⊆ I, and hence K[X_1, ..., X_n]/I is finite-dimensional. Then
(K[X_1, ..., X_n]/I)_{(P_j)} is finite-dimensional as a consequence of the statement
of Theorem 8.12.


PROOF OF THEOREM 8.10e. If F and G have a common factor H of degree
≥ 1 such that H(P) = 0, we may assume that H is irreducible. Introduce affine
local coordinates about P. If f, g, h denote the local versions of F, G, H, then
the ideal (f, g) of K[X, Y] is contained in the principal ideal (h). The latter
ideal is proper because h(0, 0) = 0, and the irreducibility of H thus implies that
(h) is prime. If S denotes the multiplicative system in K[X, Y] of polynomials
that are nonvanishing at (0, 0), then S^{-1}(f, g) ⊆ S^{-1}(h), and we have a natural
quotient homomorphism of S^{-1}K[X, Y]/S^{-1}(f, g) onto S^{-1}K[X, Y]/S^{-1}(h).
The latter is isomorphic as a K algebra to (K[X, Y]/(h))_{(0,0)}, and the dimension
of this localization is a lower bound for I(P, F ∩ G). Since K[X, Y]/(h) is an
integral domain, K[X, Y]/(h) maps one-one into any localization of itself, and
dim_K (K[X, Y]/(h)) is a lower bound for I(P, F ∩ G). Since h is nonconstant,
either X or Y actually occurs in it, say Y. Then h divides no nonzero member of K[X], and
the mapping of K[X] into cosets modulo (h) is one-one. Therefore K[X, Y]/(h)
contains a subalgebra isomorphic to K[X] and must be infinite-dimensional.

Conversely if F and G have no common factor of degree ≥ 1 with P on its
locus, then (d) shows that we may assume F and G to have no common factor of
degree ≥ 1 of any kind. In this case Theorem 8.5 shows that the locus of common
zeros of F and G is finite, and Corollary 8.13 shows that I(P, F ∩ G) is finite.


PROOF OF THEOREM 8.10f. We are to prove that

$$I(P, F \cap GH) = I(P, F \cap G) + I(P, F \cap H). \qquad (*)$$

If $F$ and $GH$ have a common factor of degree $\ge 1$ that vanishes at $P$, then $F$ and one of $G$ and $H$ have such a factor. By symmetry we may assume that $F$ and $G$ have that common factor. Then the left side of $(*)$ and the first term on the right are infinite by (e), and $(*)$ is verified.

Thus we may assume that $F$ and $GH$ have no common factor that vanishes at $P$. If $F$ has a prime factor that does not vanish at $P$, then (d) shows that we can drop that factor from all three appearances of $F$ in $(*)$. In other words, it is enough to prove (f) under the assumption that $F$ and $GH$ have no common factor of degree $\ge 1$ of any kind.

With this assumption in place, introduce affine local coordinates about $P$, let $S$ denote the multiplicative system in $K[X, Y]$ of polynomials that are nonvanishing at $(0, 0)$, and let $f, g, h$ be the local versions of the given curves $F, G, H$. The inclusion of ideals $(f, gh) \subseteq (f, g)$ induces an inclusion $S^{-1}(f, gh) \subseteq S^{-1}(f, g)$ and then an onto algebra homomorphism

$$\varphi : S^{-1}K[X, Y]/S^{-1}(f, gh) \to S^{-1}K[X, Y]/S^{-1}(f, g).$$

We shall exhibit a $K$ vector-space isomorphism $\psi$ of $S^{-1}K[X, Y]/S^{-1}(f, h)$ onto $\ker\varphi$, and the resulting dimensional equality

$$\dim_K\bigl(S^{-1}K[X, Y]/S^{-1}(f, gh)\bigr) = \dim_K\bigl(S^{-1}K[X, Y]/S^{-1}(f, g)\bigr) + \dim_K\bigl(S^{-1}K[X, Y]/S^{-1}(f, h)\bigr) \qquad (**)$$

will prove $(*)$ and hence (f). We define

$$\Psi : S^{-1}K[X, Y] \to S^{-1}K[X, Y]/S^{-1}(f, gh)$$

as a $K$ linear map by $\Psi(u) = gu + S^{-1}(f, gh)$. If $af + bh$ is in $S^{-1}(f, h)$, then $\Psi(af + bh) = afg + bgh + S^{-1}(f, gh) = S^{-1}(f, gh)$. Thus $\Psi$ descends to a $K$ linear map $\psi$ of $S^{-1}K[X, Y]/S^{-1}(f, h)$ into $S^{-1}K[X, Y]/S^{-1}(f, gh)$. It is evident that $\varphi\Psi = 0$ and hence that $\varphi\psi = 0$, i.e., $\operatorname{image}\psi \subseteq \ker\varphi$.

If any member $u + S^{-1}(f, gh)$ of $\ker\varphi$ is given, then $0 = \varphi(u + S^{-1}(f, gh)) = u + S^{-1}(f, g)$ shows that $u$ is in $S^{-1}(f, g)$. Say that $u = af + bg$. Then $\psi(b + S^{-1}(f, h)) = bg + S^{-1}(f, gh) = bg + af + S^{-1}(f, gh) = u + S^{-1}(f, gh)$ shows that $\operatorname{image}\psi \supseteq \ker\varphi$. Hence $\operatorname{image}\psi = \ker\varphi$, i.e., $\psi$ is onto $\ker\varphi$.

To see that $\psi$ is one-one, suppose that $\psi(u + S^{-1}(f, h))$ is the 0 coset, i.e., that $gu + S^{-1}(f, gh) = S^{-1}(f, gh)$. Then $gu = af + bgh$ with $u, a, b$ in $S^{-1}K[X, Y]$. Clearing fractions, we may assume that $u, a, b$ are in $K[X, Y]$. The formula $g(u - bh) = af$ in $K[X, Y]$, in the presence of the assumption that $F$ and $G$ have no common factor of degree $\ge 1$, implies that $f$ divides $u - bh$. Write $u - bh = cf$ with $c$ in $K[X, Y]$. Then $u = cf + bh$, and $u$ lies in the ideal $(f, h)$. In other words, $u + S^{-1}(f, h)$ is the trivial coset, and $\psi$ has been shown to be one-one. This proves $(**)$ and hence (f).

Lemma 8.14. For any field $K$, let $\{L_i\}_{i \ge 1}$ be a system of nonzero homogeneous polynomials in $K[X, Y]$ of the form $L_i = a_iX + b_iY$, let $\{M_j\}_{j \ge 1}$ be another such system with $M_j = c_jX + d_jY$, and suppose that no $L_i$ is a scalar multiple of any $M_j$. For $n \ge 1$, let $B_0, \dots, B_n$ be the system of homogeneous polynomials

$$B_k = L_1 \cdots L_k M_1 \cdots M_{n-k} \qquad \text{for } 0 \le k \le n.$$

Then $\{B_0, \dots, B_n\}$ is a vector-space basis of the space $K[X, Y]_n$ of all homogeneous polynomials in $(X, Y)$ of degree $n$.


PROOF. The set $\{B_0, \dots, B_n\}$ has $n + 1$ elements, and $n + 1$ is the dimension of $K[X, Y]_n$ because $\{X^n, X^{n-1}Y, \dots, Y^n\}$ is a basis. Thus it is enough to show that $\{B_0, \dots, B_n\}$ is linearly independent. If we have a relation $\sum_{k=0}^{n} c_kB_k = 0$ for scalars $c_k$, then we observe that $L_1$ divides each $B_k$ for $k > 0$, and $L_1$ does not divide $B_0$ because by assumption $L_1$ does not divide any factor $M_j$. Thus $c_0 = 0$. In effect, case $n$ of the lemma has now been reduced to case $n - 1$, and the result readily follows by induction.
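
Lemma 8.14 can be spot-checked numerically over the rationals. The following is a minimal sketch, not part of the text: the helper names `multiply_forms`, `basis_candidates`, and `rank` are hypothetical, a homogeneous polynomial of degree $n$ in $(X, Y)$ is stored as its list of $n + 1$ coefficients, and linear independence of $B_0, \dots, B_n$ is tested by computing the rank of the coefficient matrix.

```python
from fractions import Fraction

def multiply_forms(p, q):
    """Multiply two homogeneous polynomials in X, Y given as coefficient lists.

    A list [c0, c1, ..., cn] stands for c0*X^n + c1*X^(n-1)*Y + ... + cn*Y^n.
    """
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def basis_candidates(L, M, n):
    """Return the coefficient lists of B_0, ..., B_n, B_k = L_1...L_k M_1...M_(n-k).

    L and M are lists of pairs (a, b) standing for the linear forms a*X + b*Y.
    """
    rows = []
    for k in range(n + 1):
        prod = [Fraction(1)]  # the constant polynomial 1
        for form in L[:k] + M[:n - k]:
            prod = multiply_forms(prod, [Fraction(c) for c in form])
        rows.append(prod)
    return rows

def rank(rows):
    """Rank of a rational matrix by Gaussian elimination."""
    rows = [r[:] for r in rows]
    rk = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(rk + 1, len(rows)):
            factor = rows[r][col] / rows[rk][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rk])]
        rk += 1
    return rk

# Illustrative data: L_i = X + i*Y and M_j = X - j*Y, so no L_i is a multiple of an M_j.
L = [(1, i) for i in range(1, 5)]
M = [(1, -j) for j in range(1, 5)]
print(rank(basis_candidates(L, M, 4)))
```

With the hypothesis of the lemma satisfied, the rank equals $n + 1$; if some $L_i$ is proportional to some $M_j$ (for instance $L_1 = X + Y$ and $M_1 = 2X + 2Y$ with $n = 1$), the rank drops.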


PROOF OF THEOREM 8.10g. Put $p = m_P(F)$ and $q = m_P(G)$. We pass to affine local coordinates about $P$, letting $f$ and $g$ be the members of $K[X, Y]$ corresponding to $F$ and $G$. If $I$ denotes the maximal ideal $I = (X, Y)$ in $K[X, Y]$, then $f$ lies in $I^p$ and $g$ lies in $I^q$. We form the following sequence of $K$ vector spaces and $K$ linear mappings:

$$K[X,Y]/I^q \oplus K[X,Y]/I^p \xrightarrow{\ \psi\ } K[X,Y]/I^{p+q} \xrightarrow{\ \varphi\ } K[X,Y]/(I^{p+q} + (f, g)) \longrightarrow 0.$$

Here the mapping $\varphi$ is the algebra homomorphism induced by the inclusion $I^{p+q} \subseteq I^{p+q} + (f, g)$, and it is onto $K[X, Y]/(I^{p+q} + (f, g))$. The mapping $\psi$ is defined by

$$\psi(a + I^q,\ b + I^p) = af + bg + I^{p+q}$$

and is merely $K$ linear.

Let us see that the sequence is exact at $K[X, Y]/I^{p+q}$. Since

$$\varphi\psi(a + I^q,\ b + I^p) = \varphi(af + bg + I^{p+q}) = I^{p+q} + (f, g),$$

we obtain $\operatorname{image}\psi \subseteq \ker\varphi$. If $h + I^{p+q}$ is in $\ker\varphi$, then $h$ is in $I^{p+q} + (f, g)$, hence is of the form $u + af + bg$ with $u$ in $I^{p+q}$. Then $h - u = af + bg$, and $\psi(a + I^q, b + I^p) = h - u + I^{p+q} = h + I^{p+q}$. So $\operatorname{image}\psi \supseteq \ker\varphi$, and we have $\operatorname{image}\psi = \ker\varphi$.

The mapping $\psi$ descends to a one-one linear map of

$$M = \bigl(K[X, Y]/I^q \oplus K[X, Y]/I^p\bigr)/\ker\psi$$

into $K[X, Y]/I^{p+q}$. The vector space $K[X, Y]/I^q$ may be identified with the space of all polynomials of degree less than $q$, and that space is finite-dimensional. Similarly $K[X, Y]/I^p$ is finite-dimensional, and therefore

$$\dim_K M = \dim_K K[X, Y]/I^q + \dim_K K[X, Y]/I^p - \dim_K \ker\psi. \qquad (*)$$

Meanwhile, $\varphi$ exhibits $K[X, Y]/(I^{p+q} + (f, g))$ as isomorphic as a vector space to $(K[X, Y]/I^{p+q})/M$. Consequently

$$\dim_K K[X, Y]/I^{p+q} = \dim_K M + \dim_K K[X, Y]/(I^{p+q} + (f, g)). \qquad (**)$$

Combining $(*)$ and $(**)$ with the simple vector-space isomorphism $K[X, Y]/I^d \cong K[X, Y, W]_{d-1}$ and with the fact from Section 3 that $\dim_K K[X, Y, W]_{d-1} = \binom{d+1}{2}$ gives

$$\begin{aligned}
\dim_K K[X, Y]/(I^{p+q} + (f, g))
&= \dim_K K[X, Y]/I^{p+q} - \dim_K K[X, Y]/I^q - \dim_K K[X, Y]/I^p + \dim_K \ker\psi \\
&\ge \dim_K K[X, Y]/I^{p+q} - \dim_K K[X, Y]/I^q - \dim_K K[X, Y]/I^p \\
&= \tbinom{p+q+1}{2} - \tbinom{q+1}{2} - \tbinom{p+1}{2} \\
&= pq, \qquad (\dagger)
\end{aligned}$$

with equality in the step marked $\ge$ if and only if $\ker\psi = 0$.
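
The final equality in $(\dagger)$ is a short computation with binomial coefficients. As a worked check, using only $\binom{d+1}{2} = \tfrac12 d(d+1)$:

```latex
\binom{p+q+1}{2} - \binom{q+1}{2} - \binom{p+1}{2}
  = \tfrac12\bigl[(p+q)(p+q+1) - q(q+1) - p(p+1)\bigr]
  = \tfrac12\bigl[(p^2 + 2pq + q^2 + p + q) - (q^2 + q) - (p^2 + p)\bigr]
  = pq .
```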
The locus of common zeros of $I^{p+q} + (f, g)$ is just $\{(0, 0)\}$, and Theorem 8.12 therefore shows that

$$\dim_K \bigl(K[X, Y]/(I^{p+q} + (f, g))\bigr)_{(0,0)} = \dim_K K[X, Y]/(I^{p+q} + (f, g)). \qquad (\dagger\dagger)$$

The inclusion $(f, g) \subseteq I^{p+q} + (f, g)$ induces an algebra homomorphism of $(K[X, Y]/(f, g))_{(0,0)}$ onto $\bigl(K[X, Y]/(I^{p+q} + (f, g))\bigr)_{(0,0)}$. Therefore

$$\dim_K \bigl(K[X, Y]/(f, g)\bigr)_{(0,0)} \ge \dim_K \bigl(K[X, Y]/(I^{p+q} + (f, g))\bigr)_{(0,0)}. \qquad (\ddagger)$$

Let $S$ be the set-theoretic complement of $I = (X, Y)$ in $K[X, Y]$. Because of the isomorphism $S^{-1}K[X, Y]/S^{-1}J \cong (K[X, Y]/J)_{(0,0)}$ for any ideal $J$, equality will hold in $(\ddagger)$ if $S^{-1}(f, g) = S^{-1}(I^{p+q} + (f, g))$. Combining $(\dagger)$, $(\dagger\dagger)$, and $(\ddagger)$, we find that

$$I(P, F \cap G) \ge pq, \qquad (\ddagger\ddagger)$$

with equality if

$$I^{p+q} \subseteq S^{-1}(f, g) \quad\text{and}\quad \psi \text{ is one-one.} \qquad (\S)$$

Inequality $(\ddagger\ddagger)$ completes the proof of the inequality in (g) of the theorem. Because equality holds in $(\ddagger\ddagger)$ if $(\S)$ holds, we can complete the proof of all of (g) by showing that $(\S)$ holds if $F$ and $G$ have no tangent line in common.

Thus for the remainder of the proof, we assume that $F$ and $G$ have no tangent line in common. Let the tangent lines of $F$, repeated according to their multiplicities, be $L_1, \dots, L_p$, and let the tangent lines of $G$ be $M_1, \dots, M_q$. Define $L_i$ for $i > p$ to be $L_p$, and define $M_j$ for $j > q$ to be $M_q$.

In order to prove the first conclusion of $(\S)$, namely that $I^{p+q} \subseteq S^{-1}(f, g)$, we shall prove that $I^t \subseteq S^{-1}(f, g)$ for $t$ sufficiently large, and then we shall prove by induction downward on $t$ that $I^t \subseteq S^{-1}(f, g)$ as long as $t \ge p + q$. If $f$ and $g$ were to have a nonconstant common factor, then a tangent line for that common factor would be a tangent line for both $f$ and $g$, and no such tangent line exists according to our assumption. Therefore Bezout's Theorem (Theorem 8.2) applies to $f$ and $g$ and shows that their locus of common zeros is finite. Let it be $\{(0, 0), Q_1, \dots, Q_l\}$. The third paragraph of the proof of Theorem 8.12 shows that there exists a polynomial $h$ in $K[X, Y]$ such that $h(0, 0) = 1$ and $h(Q_i) = 0$ for $1 \le i \le l$. Then $Xh$ and $Yh$ vanish on $\{(0, 0), Q_1, \dots, Q_l\}$, and the Nullstellensatz (Theorem 7.1) shows that there exists $N$ such that $(Xh)^N$ and $(Yh)^N$ lie in $(f, g)$. Since $h$ is in the multiplicative system $S$, $X^N$ and $Y^N$ lie in $S^{-1}(f, g)$. Any monomial of degree $\ge 2N$ contains either a factor $X^N$ or a factor $Y^N$, and consequently $I^{2N} \subseteq S^{-1}(f, g)$.

Proceeding inductively downward on $t$, suppose that $I^t \subseteq S^{-1}(f, g)$ and that $t - 1 \ge p + q$. As in Lemma 8.14, the polynomials defined by $B_k = L_1 \cdots L_k M_1 \cdots M_{t-1-k}$ for $0 \le k \le t - 1$ form a vector-space basis of $K[X, Y]_{t-1}$. We show that each of these lies in $S^{-1}(f, g)$; then we can conclude that $I^{t-1} \subseteq S^{-1}(f, g)$, and our induction will be complete. Let $f = f_p + f_{p+1} + \cdots$ and $g = g_q + g_{q+1} + \cdots$ be the expansions of $f$ and $g$ as sums of homogeneous polynomials in $(X, Y)$. If $B_k$ is given, then an inequality $k \ge p$ would imply that $B_k$ contains a factor $L_1 \cdots L_p$; this is $f_p$ up to a constant factor. An inequality $t - 1 - k \ge q$ would imply that $B_k$ contains a factor $M_1 \cdots M_q$; this is $g_q$ up to a constant factor. Since $k < p$ and $t - 1 - k < q$ would together imply the inequality $t - 1 < p + q$ that we are assuming not to be the case, one of the alternatives $k \ge p$ and $t - 1 - k \ge q$ must occur. Say the first occurs. Except for a constant factor, we then have $B_k = f_pC$ for some homogeneous polynomial $C(X, Y)$ of degree $t - 1 - p$. Substituting for $f_p$ gives $B_k = (f - f_{p+1} - \cdots)C$. Each term $f_{p+r}C$ with $r > 0$ is of degree $(p + r) + (t - 1 - p) > t - 1$ and therefore lies in $I^t \subseteq S^{-1}(f, g)$. Also, the term $fC$ lies in $S^{-1}(f, g)$. Hence $B_k$ lies in $S^{-1}(f, g)$. This completes the induction, and we conclude that $I^{p+q} \subseteq S^{-1}(f, g)$.

In order to prove the second conclusion of $(\S)$, namely that $\psi$ is one-one, suppose that $0 = \psi(a + I^q, b + I^p) = af + bg + I^{p+q}$, i.e., that all terms of $af + bg$ are of order $\ge p + q$. Write $a = a_r + a_{r+1} + \cdots$ with $a_r \ne 0$ if $a$ is not in $I^q$, and write $b = b_s + b_{s+1} + \cdots$ with $b_s \ne 0$ if $b$ is not in $I^p$, so that

$$af + bg = a_rf_p + b_sg_q + (\text{higher-order terms}).$$

The right side is assumed to be in $I^{p+q}$, which means that one of the following two conditions is satisfied:

(i) $r + p = s + q < p + q$ and $a_rf_p + b_sg_q = 0$,

(ii) $a_rf_p$ is in $I^{p+q}$, and $b_sg_q$ is in $I^{p+q}$.

If (i) holds, then the facts that $a_rf_p = -b_sg_q$ and that $f$ and $g$ have no tangent lines in common imply that $f_p$ divides $b_s$. Since $s < p$, we must have $b_s = 0$. Therefore $a_r = 0$, and the conditions on $a_r$ and $b_s$ imply that $a$ is in $I^q$ and $b$ is in $I^p$, which we are trying to show. If (ii) holds, then the fact that $a_rf_p$ is in $I^{p+q}$ implies that $a_r = 0$ or $r \ge q$; in either case, $a$ is in $I^q$. Similarly the fact that $b_sg_q$ is in $I^{p+q}$ implies that $b_s = 0$ or $s \ge p$; in either case, $b$ is in $I^p$. We conclude that $\psi$ is one-one, as was to be shown.


6. General Form of Bezout’s Theorem for Plane Curves 


With the discussion complete concerning intersection multiplicity for general 
projective plane curves, we arrive at the general form of Bezout’s Theorem for 
plane curves. 


Theorem 8.15 (Bezout's Theorem). Let $K$ be an algebraically closed field, and let $F$ and $G$ be projective plane curves over $K$ of respective degrees $m$ and $n$. If $F$ and $G$ have no common factor of positive degree, then

$$\sum_{P \in \mathbb{P}^2_K} I(P, F \cap G) = mn.$$


REMARKS. The sum over P has only finitely many nonzero terms by Theorem 
8.5, and each intersection multiplicity in the sum is finite by Theorem 8.10e. 


PROOF. Theorem 8.5 shows that the locus of common zeros of $F$ and $G$ is a finite set. By applying a suitable $\Phi$ in $GL(3, K)$, we may assume that none of these zeros lies on the line at infinity, namely $W$. To do so, we choose a point $P$ not in the finite set of common zeros. There are only finitely many lines passing through $P$ and some member of the set of common zeros, and we choose a line through $P$ different from all these. If $\Phi$ is chosen so as to move this line to the line at infinity $W$, then none of the common zeros will lie on the line $W$.

With this normalization in place, let $\{P_1, \dots, P_k\}$ be the set of common zeros of $F$ and $G$. We introduce local versions $f$ and $g$ of $F$ and $G$ by the definitions $f(X, Y) = F(X, Y, 1)$ and $g(X, Y) = G(X, Y, 1)$. Application of Theorem 8.12 to the ideal $I = (f, g)$ in $K[X, Y]$ gives

$$\dim_K K[X, Y]/(f, g) = \sum_{j=1}^{k} \dim_K \bigl(K[X, Y]/(f, g)\bigr)_{(P_j)} = \sum_{j=1}^{k} I(P_j, F \cap G).$$

The theorem will therefore follow if we prove that

$$\dim_K K[X, Y]/(f, g) = mn. \qquad (*)$$

To prove $(*)$, we shall first prove a related equality concerning $K[X, Y, W]$ and the ideal $(F, G)$ in it, and then we shall use the fact that $F$ and $G$ have no common zeros with $W$ to transfer the conclusion to $K[X, Y]$.

Define $K$ linear mappings $\varphi : K[X, Y, W] \oplus K[X, Y, W] \to K[X, Y, W]$ and $\psi : K[X, Y, W] \to K[X, Y, W] \oplus K[X, Y, W]$ by

$$\varphi(A, B) = AF + BG \qquad\text{and}\qquad \psi(C) = (CG, -CF),$$

and form the sequence of $K$ vector spaces and $K$ linear maps given by

$$0 \longrightarrow K[X, Y, W] \xrightarrow{\ \psi\ } K[X, Y, W] \oplus K[X, Y, W] \xrightarrow{\ \varphi\ } K[X, Y, W]. \qquad (**)$$

It is evident that $\psi$ is one-one, that $\varphi\psi = 0$, and that $\operatorname{image}\varphi = (F, G)$. If $(A, B)$ is in $\ker\varphi$, then $AF + BG = 0$. Since $F$ and $G$ have no common factor of positive degree, $F$ divides $B$ and $G$ divides $A$. Setting $C = AG^{-1}$ therefore gives $A = CG$ and $B = -AG^{-1}F = -CF$. Hence $(A, B)$ lies in $\operatorname{image}\psi$. In other words, $(**)$ is exact, and $\operatorname{image}\varphi = (F, G)$.

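Let $d \ge m + n$. If we denote by $\psi_d$ and $\varphi_d$ the restrictions of $\psi$ and $\varphi$ to $K[X, Y, W]_{d-m-n}$ and $K[X, Y, W]_{d-m} \oplus K[X, Y, W]_{d-n}$, respectively, and if we go over the argument in the previous paragraph, then we see that the sequence

$$0 \longrightarrow K[X, Y, W]_{d-m-n} \xrightarrow{\ \psi_d\ } K[X, Y, W]_{d-m} \oplus K[X, Y, W]_{d-n} \xrightarrow{\ \varphi_d\ } K[X, Y, W]_d$$

is exact and that $\operatorname{image}\varphi_d = (F, G)_d$. The vector spaces in question here are all finite-dimensional, and thus we obtain

$$\begin{aligned}
\dim_K (F, G)_d
&= \dim_K K[X, Y, W]_{d-m} + \dim_K K[X, Y, W]_{d-n} - \dim_K K[X, Y, W]_{d-m-n} \\
&= \tbinom{d-m+2}{2} + \tbinom{d-n+2}{2} - \tbinom{d-m-n+2}{2} \\
&= -mn + \tbinom{d+2}{2} \\
&= -mn + \dim_K K[X, Y, W]_d. \qquad (\dagger)
\end{aligned}$$
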
The ideal $(F, G)$ is homogeneous, and thus we know from Section 3 that the image of $K[X, Y, W]_d$ in $K[X, Y, W]/(F, G)$ is $K[X, Y, W]_d/(F, G)_d$. If we write $\bigl(K[X, Y, W]/(F, G)\bigr)_d$ for this quotient, then $(\dagger)$ shows that

$$\dim_K \bigl(K[X, Y, W]/(F, G)\bigr)_d = mn \qquad (\dagger\dagger)$$

for all $d \ge m + n$.
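
The dimension count behind $(\dagger)$ and $(\dagger\dagger)$ is easy to test numerically. The sketch below is illustrative only; the function names are hypothetical, and the only input taken from the text is the formula $\dim_K K[X, Y, W]_d = \binom{d+2}{2}$ together with the exact sequence in degree $d$.

```python
from math import comb

def dim_homogeneous(d):
    """Dimension of K[X, Y, W]_d, the homogeneous polynomials of degree d."""
    return comb(d + 2, 2) if d >= 0 else 0

def dim_quotient(d, m, n):
    """Dimension of (K[X,Y,W]/(F,G))_d obtained from the exact sequence.

    Assumes deg F = m, deg G = n, no common factor, and d >= m + n.
    """
    dim_ideal = (dim_homogeneous(d - m) + dim_homogeneous(d - n)
                 - dim_homogeneous(d - m - n))
    return dim_homogeneous(d) - dim_ideal

# For a cubic and a quartic the value stays at 12 once d >= 7.
print([dim_quotient(d, 3, 4) for d in range(7, 12)])
```

The value does not depend on $d$ in the range $d \ge m + n$, which is what allows the later argument to pass from one degree to the next by multiplication by $W$.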

To prove $(*)$ and the theorem, we shall translate $(\dagger\dagger)$ into a conclusion about $K[X, Y]/(f, g)$. Fix $d \ge m + n$, and let $\{V_1 + (F, G), \dots, V_{mn} + (F, G)\}$ be a $K$ basis of $\bigl(K[X, Y, W]/(F, G)\bigr)_d$. Define $v_j(X, Y) = V_j(X, Y, 1)$ for each $j$. We shall prove that the vectors

$$v_1 + (f, g),\ \dots,\ v_{mn} + (f, g) \qquad (\ddagger)$$

form a $K$ basis of $K[X, Y]/(f, g)$.

We need to make use of the fact that $F$ and $G$ have no common zeros on the line at infinity. Since $W(F, G) \subseteq (F, G)$, the $K$ linear mapping of multiplication by $W$ on $K[X, Y, W]$ descends to a $K$ linear mapping $L$ of $K[X, Y, W]/(F, G)$ to itself defined by $L(H + (F, G)) = WH + (F, G)$. Let us see that

$$L : K[X, Y, W]/(F, G) \to K[X, Y, W]/(F, G) \quad\text{is one-one.} \qquad (\ddagger\ddagger)$$


In fact, suppose that $WH = AF + BG$ for some $H$ in $K[X, Y, W]$. For any $U$ in $K[X, Y, W]$, let $U_0(X, Y) = U(X, Y, 0)$. If $U$ is homogeneous, then so is $U_0$. In this notation we can write $F = F_0 + WM$ and $G = G_0 + WN$ for homogeneous members $M$ and $N$ of $K[X, Y, W]$. The polynomials $F_0$ and $G_0$ are relatively prime: in fact, if $F_0$ and $G_0$ have a nontrivial common factor $D_0$, then we can regard $D_0$ as a projective plane curve, and it must have a common zero $Q$ with $W$, by Theorem 8.5; but then $F$, $G$, and $W$ have $Q$ as a common zero, in contradiction to the normalization in the first paragraph of the proof. Since $WH = AF + BG$ implies $A_0F_0 = -B_0G_0$, it follows that $F_0$ divides $B_0$ and that $G_0$ divides $A_0$. In other words, $B_0 = C_0F_0$ and $A_0 = -C_0G_0$ for some $C_0$ in $K[X, Y]$. If we define $A' = A + C_0G$ and $B' = B - C_0F$, then the formulas for $A_0$ and $B_0$ show that $A'_0 = B'_0 = 0$. Hence $A' = WA''$ and $B' = WB''$ for some homogeneous polynomials $A''$ and $B''$. Then $WH = AF + BG = (A' - C_0G)F + (B' + C_0F)G = A'F + B'G = W(A''F + B''G)$, and we obtain $H = A''F + B''G$. Thus $H$ lies in $(F, G)$, and $(\ddagger\ddagger)$ is proved.

Left multiplication $L$ by $W$ carries $K[X, Y, W]_d$ into $K[X, Y, W]_{d+1}$ and carries $(F, G)_d$ into $(F, G)_{d+1}$. Therefore $L$ is well defined as a mapping from $\bigl(K[X, Y, W]/(F, G)\bigr)_d$ into $\bigl(K[X, Y, W]/(F, G)\bigr)_{d+1}$. Since it is one-one by $(\ddagger\ddagger)$ and since the two spaces are finite-dimensional of the same dimension $mn$ by $(\dagger\dagger)$, it is onto. Therefore

$$\{W^rV_1 + (F, G),\ \dots,\ W^rV_{mn} + (F, G)\} \quad\text{is a basis} \qquad (\S)$$

of $\bigl(K[X, Y, W]/(F, G)\bigr)_{d+r}$ for every $r \ge 0$.

To prove that $(\ddagger)$ spans $K[X, Y]/(f, g)$, let $h$ be in $K[X, Y]$. Let $H$ be a homogeneous polynomial in $K[X, Y, W]$ with $h(X, Y) = H(X, Y, 1)$, and choose an integer $s$ such that $W^sH$ lies in $K[X, Y, W]_{d+r}$ for some $r \ge 0$. Then we can write $W^sH = \sum_{j=1}^{mn} c_jW^rV_j + AF + BG$ for suitable scalars $c_j$ and homogeneous polynomials $A$ and $B$. Restricting the domain to points $(X, Y, 1)$ gives $h = \sum_{j=1}^{mn} c_jv_j + af + bg$, and therefore $h + (f, g) = \sum_{j=1}^{mn} c_jv_j + (f, g)$. This proves that $(\ddagger)$ spans $K[X, Y]/(f, g)$.

To prove that $(\ddagger)$ is linearly independent, suppose that $\sum_{j=1}^{mn} c_jv_j = af + bg$ with $a$ and $b$ in $K[X, Y]$. If $A$ and $B$ are homogeneous polynomials such that $a(X, Y) = A(X, Y, 1)$ and $b(X, Y) = B(X, Y, 1)$, then $W^r\sum_{j=1}^{mn} c_jV_j = W^sAF + W^tBG$, provided the exponents $r, s, t$ are chosen to make the degrees of the terms $W^r\sum_{j=1}^{mn} c_jV_j$, $W^sAF$, and $W^tBG$ match. Consequently $W^r\sum_{j=1}^{mn} c_jV_j$ lies in $(F, G)_{d+r}$, and $(\S)$ shows that the coefficients are all 0. This proves that $(\ddagger)$ is linearly independent.


7. Gröbner Bases


The remainder of the chapter returns to the main question introduced in Section 1, 
that of how to get information about the set of simultaneous solutions of polyno- 
mial equations in several variables. The resultant introduced in Section 2 gave us 
one tool, but the tool is of most use when there are only two equations. Beyond 
two equations the number of cases to check quickly grows, and the resultant is of 
limited usefulness.¹²

The tool to be introduced in this section is of a completely different nature. 
Historically it was introduced in order to have a way of deciding whether an ideal 
in K[X1,..., Xn] contains a given polynomial. We know from the Hilbert Basis 
Theorem that every such ideal is finitely generated, and it is assumed that the 
ideal to be tested is specified by such a set of generators. 

The proof of the Hilbert Basis Theorem gives a clue how to start studying an ideal of polynomials. In the statement of the theorem, $R$ is a Noetherian integral domain, and $I$ is a nonzero ideal in $R[X]$. It is to be proved that $I$ is finitely generated. The proof by Hilbert is longer than the proof given in Basic Algebra, but the idea is clearer. To each nonzero member $f(X)$ of $I$, we associate the coefficient of the highest power of $X$ appearing in $f(X)$. These coefficients, together with 0, form an ideal $L(I)$ in $R$, and $L(I)$ is finitely generated because $R$ is Noetherian. Let $a_1, \dots, a_r$ be generators, let $f_1(X), \dots, f_r(X)$ be members of $I$ with respective highest coefficients $a_1, \dots, a_r$, and let $q$ be the largest of the degrees of $f_1(X), \dots, f_r(X)$. If a general $g(X)$ in $I$ of degree at least $q$ is given and if $a \in R$ is its highest coefficient, then we know that $a = \sum_j c_ja_j$ with $c_j \in R$. The polynomial $h(X)$ given by $h(X) = g(X) - \sum_j c_jf_j(X)X^{\deg g - \deg f_j}$ has degree lower than $\deg g$, and $g(X)$ will be in $(f_1, \dots, f_r)$ if $h(X)$ is in $(f_1, \dots, f_r)$. Iterating this construction, we see that it is enough to account for all the members of $I$ of degree $\le q - 1$. To handle these, one way to proceed is to enlarge the set $\{f_1, \dots, f_r\}$ a little. For each $k$ with $0 \le k \le q - 1$, let $L_k(I)$ be the union of $\{0\}$ and the set of coefficients of $X^k$ in members of $I$ of degree $k$. Each of these is an ideal of $R$ and hence is finitely generated, and we adjoin to $\{f_1, \dots, f_r\}$, for each $k$ with $0 \le k \le q - 1$, finitely many members of $I$ of degree $k$ whose highest coefficients generate $L_k(I)$. The result is a finite set $\{g_1, \dots, g_s\}$ of generators of $I$, as one easily checks.

¹²The nature of the extended theory can be found in Van der Waerden, Volume II, Chapter XI. Theorem 8.31 below in effect reproduces some of this extended theory in a context that is manageable because of the theory of Gröbner bases.

In fact, the set $\{g_1, \dots, g_s\}$ is a special set of generators. For any member $f$ of $R[X]$, let $\mathrm{LT}(f)$ be the complete term of $f(X)$ containing the highest power of $X$. What the argument shows is that $\{g_1, \dots, g_s\}$ is a subset of $I$ such that $\mathrm{LT}(I) = (\mathrm{LT}(g_1), \dots, \mathrm{LT}(g_s))$, where $\mathrm{LT}(I)$ denotes the ideal given as the linear span of all polynomials $\mathrm{LT}(g)$ for $g$ in $I$. One can show that this property of $\{g_1, \dots, g_s\}$ implies that $\{g_1, \dots, g_s\}$ generates $I$. In essence this property will be the defining property of a "Gröbner basis" of $I$. It is not automatically satisfied for just any finite generating set $\{f_1, \dots, f_r\}$, as the example below shows. We shall see that it is easy to use such a set of generators to test any polynomial in $R[X]$ for membership in $I$. Thus the original problem historically for introducing such sets is solved except for one little detail: the proof of the Hilbert Basis Theorem is not constructive, and we are left with no idea how actually to construct a Gröbner basis.¹³


EXAMPLE. Treat $K[X, Y]$ as an instance of the above setting by letting $R = K[Y]$ and regarding $K[X, Y]$ as $R[X]$. Consider the ideal $I = (f_1, f_2)$ in $R[X]$ with $f_1(X, Y) = X^2 + 2XY^2$ and $f_2(X, Y) = XY + 2Y^3 - 1$. Then $(\mathrm{LT}(f_1), \mathrm{LT}(f_2)) = (X^2, XY)$, and every monomial appearing with nonzero coefficient in a member of the latter ideal has total degree at least 2. On the other hand, $I$ contains the polynomial

$$Yf_1(X, Y) - Xf_2(X, Y) = X,$$

and its leading term is $X$, whose total degree is 1. Thus $\mathrm{LT}(I)$ properly contains $(\mathrm{LT}(f_1), \mathrm{LT}(f_2))$.
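
The cancellation in this example can be confirmed mechanically. The sketch below assumes the reading $f_1 = X^2 + 2XY^2$ and $f_2 = XY + 2Y^3 - 1$ of the two generators; polynomials in $(X, Y)$ are stored as dictionaries from exponent pairs to integer coefficients, and the helper names `add` and `mul` are hypothetical.

```python
def add(p, q, sign=1):
    """Return p + sign*q for polynomials stored as {(i, j): coeff} dictionaries."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + sign * c
        if out[mono] == 0:
            del out[mono]
    return out

def mul(p, q):
    """Return the product p*q, adding exponents and multiplying coefficients."""
    out = {}
    for (a, b), c in p.items():
        for (e, f), d in q.items():
            key = (a + e, b + f)
            out[key] = out.get(key, 0) + c * d
            if out[key] == 0:
                del out[key]
    return out

X = {(1, 0): 1}
Y = {(0, 1): 1}
f1 = {(2, 0): 1, (1, 2): 2}               # X^2 + 2*X*Y^2
f2 = {(1, 1): 1, (0, 3): 2, (0, 0): -1}   # X*Y + 2*Y^3 - 1

combo = add(mul(Y, f1), mul(X, f2), sign=-1)   # Y*f1 - X*f2
print(combo)
```

The terms $X^2Y$ and $2XY^3$ cancel, leaving a polynomial of total degree 1 inside the ideal even though both generators have leading terms of total degree 2.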


Because of the nonconstructive nature of the proof of the Hilbert Basis Theorem, it is necessary to start afresh. One message to glean from the abstract proof is that the leading terms of the members of $I$ are important and somewhat control the nature of $I$. To handle $K[X_1, \dots, X_n]$ when $K$ is a field, it is of course necessary to use an additional induction that enumerates the variables. In the example above, we treated $X$ as more significant than $Y$. For the inductive step for general $K[X_1, \dots, X_n]$, the ring $R$ in the above argument is $K$ with some number $m$ of the indeterminates included, and $X$ is the $(m+1)$st indeterminate. Putting all the steps of the induction together, we see that the order in which the variables are processed appears to be important.

¹³The exposition in this section and the next three is based partly on the book of Cox–Little–O'Shea and the Web tutorial of Fabrizio listed in the Selected References.

The theory of Gröbner bases as it has evolved allows a healthy extra measure of generality. Instead of defining leading terms by insisting on an ordering of the indeterminates, it defines them by using a suitable kind of ordering of monomials, and that is where we begin. Let $K[X_1, \dots, X_n]$ be given, $K$ being a field. Let $\mathcal{M}$ be the set of all monomials in $K[X_1, \dots, X_n]$. A monomial ordering $\le$ on $\mathcal{M}$ is a total ordering¹⁴ with the two additional properties that

(i) $M_1 \le M_2$ implies $M_1M_3 \le M_2M_3$ for all $M_1, M_2, M_3$ in $\mathcal{M}$,

(ii) $1 \le M$ for all $M$ in $\mathcal{M}$.

We write $M_1 \ge M_2$ to mean $M_2 \le M_1$. Also, $M_1 < M_2$ means $M_1 \le M_2$ with $M_1 \ne M_2$, and $M_1 > M_2$ means $M_1 \ge M_2$ with $M_1 \ne M_2$.


EXAMPLES OF MONOMIAL ORDERINGS. Each ordering assumes that the variables are enumerated in some way. In these examples we take this enumeration to be $X_1, \dots, X_n$. The first four examples all have the property that the largest $X_j$ is $X_1$ and the smallest is $X_n$.


(1) Lexicographic ordering, abbreviated as "lex" by many authors and written as $\le_{\mathrm{LEX}}$ in this list of examples. This, the most important monomial ordering, is already suggested by the proof of the Hilbert Basis Theorem. In principle it can be used for all purposes in Sections 7–10, but one application in Chapter X will require a different monomial ordering. Its disadvantage is that it sometimes makes lengthy computations take longer than necessary; this matter will be discussed more in Section 9. The definition is that $X_1^{i_1} \cdots X_n^{i_n} \le_{\mathrm{LEX}} X_1^{j_1} \cdots X_n^{j_n}$ if either the two monomials are equal or else the first $k$ for which $i_k \ne j_k$ has $i_k < j_k$. Thus for example, $X_1X_2^2X_3^3 \le_{\mathrm{LEX}} X_1^2$. The word "lexicographic" refers to the dictionary system for alphabetizing in which a first word comes before a second word if for the first position in which the two words differ, the letter of the first word in that position precedes alphabetically the letter of the second word in that position.

(2) Graded lexicographic ordering, abbreviated as "glex" or "grlex" by many authors. As in Section 3 the total degree of a monomial $X_1^{i_1} \cdots X_n^{i_n}$ is $\deg(X_1^{i_1} \cdots X_n^{i_n}) = \sum_{k=1}^{n} i_k$. The definition of the ordering is that $M \le_{\mathrm{GLEX}} N$ if either $\deg M < \deg N$ or else if $\deg M = \deg N$ and $M \le_{\mathrm{LEX}} N$. Thus for example, $X_1^2 \le_{\mathrm{GLEX}} X_1X_2^2X_3^3$ because the total degree 2 of the first monomial is less than the total degree 6 of the second monomial. But $X_1X_2^2X_3^3 \le_{\mathrm{GLEX}} X_1^2X_3^4$ because both monomials have the same total degree 6 and the second monomial involves a higher power of $X_1$ than does the first. This monomial ordering is not much used; more common is the variant of it in the next example.

¹⁴This means a partial ordering with the properties that each pair $a, b$ has $a \le b$ or $b \le a$ and that both hold only if $a = b$.



(3) Graded reverse lexicographic ordering, abbreviated as "grevlex" by many authors. The definition is that $M \le_{\mathrm{GREVLEX}} N$ if either $\deg M < \deg N$ or else if $\deg M = \deg N$ and $N' \le_{\mathrm{LEX}} M'$, where $M'$ is $M$ but with the exponents of $X_j$ and $X_{n+1-j}$ interchanged for each $j$, and where $N'$ is defined similarly. This ordering takes some getting used to. For example, $X_1^2X_3^4 \le_{\mathrm{GREVLEX}} X_1X_2^2X_3^3$ when $n = 3$ because both monomials have the same total degree and $X_1^3X_2^2X_3 = (X_1X_2^2X_3^3)' \le_{\mathrm{LEX}} (X_1^2X_3^4)' = X_1^4X_3^2$. By contrast, $X_1X_2^2X_3^3 \le_{\mathrm{LEX}} X_1^2X_3^4$.

(4) Orderings of $k$-elimination type, where $1 \le k \le n - 1$. These are orderings such that any monomial containing one of $X_1, \dots, X_k$ to a positive power exceeds any monomial in $X_{k+1}, \dots, X_n$ alone. These will be discussed in Section 10. Of them, one of particular importance is the Bayer–Stillman ordering of $k$-elimination type. Here a monomial $M$ is $\le$ a monomial $N$ if the sum of the exponents of $X_1, \dots, X_k$ for $M$ is less than the corresponding sum for $N$ or else the two sums are equal and $M \le_{\mathrm{GREVLEX}} N$. This ordering is commonly used for making computations in the context of Section 10.


(5) Ordering from a tuple of weight vectors. For $1 \le i \le n$, let $w^{(i)}$ be a vector in $\mathbb{R}^n$ of the form $w^{(i)} = (w^{(i)}_1, \dots, w^{(i)}_n)$, and assume that $w^{(1)}, \dots, w^{(n)}$ are linearly independent over $\mathbb{R}$. Identify the monomial $X^\alpha$ with the vector of individual exponents $\alpha = (\alpha_1, \dots, \alpha_n)$. The ordering given by the weight vectors $w^{(i)}$ is defined by saying that $X^\alpha \le X^\beta$ if $X^\alpha = X^\beta$ or if the first $i$ such that $w^{(i)} \cdot \alpha \ne w^{(i)} \cdot \beta$ has $w^{(i)} \cdot \alpha < w^{(i)} \cdot \beta$. Here the dot refers to the ordinary dot product. A condition is needed on the $w$'s to ensure that $1 \le X^\alpha$ for all $\alpha$. (See Problem 14 at the end of the chapter.) Here are two specific examples for which the condition is satisfied. Let $e^{(i)}$ be the $i$th standard basis vector of $\mathbb{R}^n$. The lexicographic ordering in Example 1 is determined by the tuple of weight vectors $(e^{(1)}, \dots, e^{(n)})$. The Bayer–Stillman ordering in Example 4 is determined by the tuple of weight vectors
(CD 4... +e, eH 4... 40, 2, ,, 2), 2, 0), 


Further discussion of monomial orderings determined by weight vectors occurs 
in Problems 14—15 at the end of the chapter. 
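
The orderings in Examples 1 through 5 translate directly into sort keys on exponent vectors. The following is a minimal sketch under that encoding: a monomial $X_1^{i_1} \cdots X_n^{i_n}$ is the tuple $(i_1, \dots, i_n)$, the function names are hypothetical, and the three sample monomials are illustrative choices. Comparing keys with `<` then compares monomials in the corresponding ordering.

```python
def lex_key(alpha):
    """Sort key for lex: Python already compares tuples lexicographically."""
    return tuple(alpha)

def grlex_key(alpha):
    """Sort key for graded lex: total degree first, then lex."""
    return (sum(alpha), tuple(alpha))

def grevlex_key(alpha):
    """Sort key for graded reverse lex.

    For equal degrees, alpha <= beta exactly when reversed(beta) <= reversed(alpha)
    in lex, which is the same as comparing the negated reversed tuples in lex.
    """
    return (sum(alpha), tuple(-a for a in reversed(alpha)))

def bayer_stillman_key(alpha, k):
    """Sort key for the k-elimination ordering of Example 4."""
    return (sum(alpha[:k]), grevlex_key(alpha))

def weight_key(alpha, weights):
    """Sort key for the ordering from a tuple of weight vectors (Example 5)."""
    return tuple(sum(w_i * a_i for w_i, a_i in zip(w, alpha)) for w in weights)

a = (1, 2, 3)   # X1 * X2^2 * X3^3
b = (2, 0, 0)   # X1^2
c = (2, 0, 4)   # X1^2 * X3^4

for name, key in [("lex", lex_key), ("grlex", grlex_key), ("grevlex", grevlex_key)]:
    print(name, sorted([a, b, c], key=key))
```

Sorting the same three monomials under the three keys gives three different arrangements, which is the practical content of the remark that grevlex "takes some getting used to".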


Property (i) of monomial orderings insists that the ordering respect multiplication of monomials in the natural way. Property (ii), according to the next proposition, is a well-ordering property. The proof of the proposition will be preceded by a lemma.


Proposition 8.16. In any monomial ordering for $K[X_1, \dots, X_n]$, any decreasing sequence $M_1 \ge M_2 \ge M_3 \ge \cdots$ is eventually constant. Consequently each nonempty subset of $\mathcal{M}$ has a smallest element in the ordering.


Lemma 8.17. If $I$ is an ideal in $K[X_1, \dots, X_n]$ generated by monomials and if $f(X_1, \dots, X_n)$ is in $I$, then each monomial appearing in the expansion of $f$ with nonzero coefficient lies in $I$. Consequently $I$ has a finite set of monomials as generators. Moreover, if $\{M_1, \dots, M_s\}$ is a set of monomials that generate $I$ and if $M$ is any monomial in $I$, then some $M_j$ divides $M$.


PROOF. Let $\{M_\alpha\}$ be the set of monomials that generates $I$. If $f$ is in $I$, then we can write $f$ as a finite sum $f = \sum_j h_jM_{\alpha_j}$ for polynomials $h_j$. Let $h_j = \sum_i c_{ij}M_{ij}$ be the expansion of $h_j$ in terms of monomials. If $M_0$ is a monomial appearing in $f$ with nonzero coefficient $c$, then the only possible monomial $M_{ij}$ in $h_j$ that can contribute toward $c$ is one with $M_{ij}M_{\alpha_j} = M_0$ if such a monomial exists. For some $j$, such a monomial must exist, or $c$ would be 0; thus $M_0$ lies in $I$.

For the second conclusion, write $I = (f_1, \dots, f_r)$ by the Hilbert Basis Theorem. The first conclusion shows that each monomial contributing to each $f_j$ lies in $I$, and the set of all these monomials, as $j$ varies, is therefore a finite set of monomials generating $I$.

For the third conclusion, write $M = \sum_{j=1}^{s} a_jM_j$ for polynomials $a_j$. Expanding each $a_j$ in terms of monomials, we see that some $a_j$ contains with nonzero coefficient a monomial $M'$ such that $M = M'M_j$. The divisibility follows.
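
Lemma 8.17 reduces membership in a monomial ideal to divisibility checks on exponent vectors. A minimal sketch, assuming monomials are encoded as exponent tuples and polynomials as dictionaries from exponent tuples to coefficients (the function names are hypothetical):

```python
def divides(m, n):
    """True if the monomial with exponents m divides the one with exponents n."""
    return all(a <= b for a, b in zip(m, n))

def monomial_in_ideal(mono, generators):
    """A monomial lies in a monomial ideal iff some generator divides it."""
    return any(divides(g, mono) for g in generators)

def polynomial_in_monomial_ideal(poly, generators):
    """A polynomial lies in a monomial ideal iff each of its monomials does."""
    return all(monomial_in_ideal(m, generators) for m, c in poly.items() if c != 0)

gens = [(2, 0), (1, 1)]   # the ideal (X^2, X*Y) in K[X, Y]
print(monomial_in_ideal((1, 0), gens))                          # X itself
print(polynomial_in_monomial_ideal({(2, 1): 3, (1, 4): -1}, gens))
```

The first and third conclusions of the lemma are exactly what justify the two tests: every monomial of a member of the ideal lies in the ideal, and a monomial in the ideal is divisible by some generator.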


PROOF OF PROPOSITION 8.16. Let $M$ be a monomial, and let $I$ be the linear span of all monomials $M'$ with $M' \ge M$. If $M'$ is such a monomial and $N$ is any monomial, then $NM' \ge NM$ by (i), and $NM \ge 1M = M$ by (i) and (ii). Therefore $NM'$ lies in $I$, and $I$ is an ideal.

From such an ideal $I$, we can recover $M$ as the unique monomial $M_0$ in $I$ such that $M_0 \le M'$ for every monomial $M'$ in $I$, since any such $M_0$ has $M_0 \le M$ as well as $M \le M_0$.

With $M_1, M_2, \dots$ given as in the proposition, let $I_k$ be the linear span of all monomials $M' \ge M_k$. We have just seen that $I_k$ is an ideal, and the $I_k$'s are increasing in $k$. Then $I = \bigcup_{k=1}^{\infty} I_k$ is an ideal generated by monomials, and Lemma 8.17 shows that it has a finite set of monomials as a set of generators. Each such monomial generator lies in some $I_k$. Since the $I_k$'s are nested, all the generators lie in some $I_{k_0}$, and we conclude that $I = I_{k_0}$, hence that $I_k = I_{k_0}$ for all $k \ge k_0$. The previous paragraph of the proof shows that $I_k$ determines $M_k$, and therefore $M_k = M_{k_0}$ for all $k \ge k_0$.

For the last statement of the proposition, if there were no least element, then for any element in the subset, we could always find a smaller element in the subset. In this way, we would be able to construct a strictly decreasing infinite sequence in $\mathcal{M}$, in contradiction to what has just been proved.


Fix a monomial ordering for $K[X_1, \dots, X_n]$. If $f$ is any nonzero member of $K[X_1, \dots, X_n]$ and if $f$ is expanded as a $K$ linear combination of monomials, then we define the leading monomial, leading coefficient, and leading term of $f$ by

$\mathrm{LM}(f)$ = largest monomial with nonzero coefficient in expansion of $f$,
$\mathrm{LC}(f)$ = coefficient of $\mathrm{LM}(f)$ in $f$,
$\mathrm{LT}(f) = \mathrm{LC}(f)\,\mathrm{LM}(f)$.

It will be convenient to be able to use these definitions without having to distinguish the cases $f \ne 0$ and $f = 0$. Accordingly, let us adjoin 0 to the set $\mathcal{M}$, agreeing that $0 < M$ and $0M = 0$ for every monomial $M$. We adopt the convention that $\mathrm{LM}(0) = 0$, $\mathrm{LT}(0) = 0$, and $\mathrm{LC}(0) = 0$.
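
The dependence of the leading monomial on the chosen ordering is easy to see in a small computation. The sketch below is illustrative: polynomials are dictionaries from exponent tuples to coefficients, the zero polynomial is the empty dictionary, the adjoined element 0 of the set of monomials is represented by `None`, and all function names are hypothetical.

```python
def lex_key(alpha):
    """Sort key for the lexicographic ordering on exponent tuples."""
    return tuple(alpha)

def grevlex_key(alpha):
    """Sort key for the graded reverse lexicographic ordering."""
    return (sum(alpha), tuple(-a for a in reversed(alpha)))

def leading_monomial(poly, key):
    """LM(f): largest exponent tuple with nonzero coefficient, or None for f = 0."""
    support = [m for m, c in poly.items() if c != 0]
    return max(support, key=key) if support else None

def leading_coefficient(poly, key):
    """LC(f): coefficient of LM(f), with LC(0) = 0."""
    m = leading_monomial(poly, key)
    return poly[m] if m is not None else 0

def leading_term(poly, key):
    """LT(f) = LC(f) * LM(f), returned as a one-term polynomial ({} for f = 0)."""
    m = leading_monomial(poly, key)
    return {m: poly[m]} if m is not None else {}

# f = 3*X1*X2^2*X3^3 - 5*X1^2 + 7*X1^2*X3^4, an illustrative polynomial
f = {(1, 2, 3): 3, (2, 0, 0): -5, (2, 0, 4): 7}
print(leading_term(f, lex_key))
print(leading_term(f, grevlex_key))
```

Under lex the term with the highest power of $X_1$ wins, with ties broken by later variables; under grevlex the term of top total degree with the smaller power of the last variable wins, so the two calls select different terms of the same polynomial.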

Since any monomial that occurs in a sum of two polynomials occurs in one or the other of them, it is immediate from the definition that

$$\mathrm{LM}(f_1 + f_2) \le \max(\mathrm{LM}(f_1), \mathrm{LM}(f_2))$$

if $f_1$, $f_2$, and $f_1 + f_2$ are nonzero. Checking the various cases, we see that this inequality persists if one or more of $f_1$, $f_2$, and $f_1 + f_2$ are 0.

The comparable results concerning multiplication are contained in the next 
proposition. 


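Proposition 8.18. If $f_1$ and $f_2$ are two nonzero members of $K[X_1, \dots, X_n]$, then

$$\mathrm{LM}(f_1f_2) = \mathrm{LM}(f_1)\,\mathrm{LM}(f_2) \qquad\text{and}\qquad \mathrm{LC}(f_1f_2) = \mathrm{LC}(f_1)\,\mathrm{LC}(f_2);$$

hence

$$\mathrm{LT}(f_1f_2) = \mathrm{LT}(f_1)\,\mathrm{LT}(f_2).$$

These equalities persist if one or both of $f_1$ and $f_2$ are 0. Moreover, if $f_1$ and $f_2$ are nonzero and have $\mathrm{LT}(f_1) = \mathrm{LT}(f_2)$, then $\mathrm{LM}(f_1 - f_2) < \mathrm{LM}(f_1)$.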


PROOF. For the first statement, let the expansions of $f_1$ and $f_2$ as linear
combinations of distinct monomials be $f_1 = a_1 \mathrm{LM}(f_1) + \sum_i c_i M_i$ and
$f_2 = a_2 \mathrm{LM}(f_2) + \sum_j d_j N_j$ with $M_i < \mathrm{LM}(f_1)$ for all $i$ and $N_j < \mathrm{LM}(f_2)$ for all $j$.
Then $f_1 f_2$ equals

$$a_1 a_2 \mathrm{LM}(f_1)\mathrm{LM}(f_2) + a_2 \sum_i c_i M_i \mathrm{LM}(f_2) + a_1 \sum_j d_j \mathrm{LM}(f_1) N_j + \sum_{i,j} c_i d_j M_i N_j,$$

and the conclusions in the first sentence of the proposition will follow if it is
shown that $M_i \mathrm{LM}(f_2) < \mathrm{LM}(f_1)\mathrm{LM}(f_2)$, that $\mathrm{LM}(f_1) N_j < \mathrm{LM}(f_1)\mathrm{LM}(f_2)$, and
that $M_i N_j < \mathrm{LM}(f_1)\mathrm{LM}(f_2)$. The first inequality follows from (i) because
$M_i < \mathrm{LM}(f_1)$, and the second inequality is similar. For the third we apply (i) twice to
obtain $M_i N_j \leq M_i \mathrm{LM}(f_2) \leq \mathrm{LM}(f_1)\mathrm{LM}(f_2)$ and observe that the end expressions
can be equal only if equality holds in both instances. The latter is impossible
because $K[X_1,\dots,X_n]$ is an integral domain, and thus $M_i N_j < \mathrm{LM}(f_1)\mathrm{LM}(f_2)$.

The three displayed equalities persist if one or both of $f_1$ and $f_2$ are 0 because
$\mathrm{LM}(f)$, $\mathrm{LT}(f)$, and $\mathrm{LC}(f)$ can be 0 only if $f = 0$.

Finally if $f_1$ and $f_2$ are nonzero and have expansions as in the first paragraph of
the proof with $\mathrm{LT}(f_1) = \mathrm{LT}(f_2)$, then $\mathrm{LC}(f_1) = a_1$ and $\mathrm{LC}(f_2) = a_2$. Hence $f_1 - f_2$
has an expansion involving only the monomials $M_i$ and $N_j$. Consequently if
$f_1 - f_2 \neq 0$, then the largest of the $M_i$'s and $N_j$'s is $< \mathrm{LM}(f_1)$. Thus
$\mathrm{LM}(f_1 - f_2) < \mathrm{LM}(f_1)$. This inequality holds also if $f_1 - f_2 = 0$.


If $I$ is a nonzero ideal in $K[X_1,\dots,X_n]$, we define $\mathrm{LT}(I)$ to be the vector
space of all $K$ linear combinations of polynomials $\mathrm{LT}(f)$ with $f$ in $I$. It follows
from Proposition 8.18 that $K[X_1,\dots,X_n]\,\mathrm{LT}(I) \subseteq \mathrm{LT}(I)$, and therefore
$\mathrm{LT}(I)$ is an ideal in $K[X_1,\dots,X_n]$. A finite unordered subset $\{g_1,\dots,g_k\}$
of nonzero elements of the ideal $I$ is called a Gröbner basis of $I$ if
$\mathrm{LT}(I) = (\mathrm{LT}(g_1),\dots,\mathrm{LT}(g_k))$. The inclusion $\supseteq$ follows from the definition, and the
question is whether $\mathrm{LT}(g_1),\dots,\mathrm{LT}(g_k)$ generate $\mathrm{LT}(I)$.

Among the examples below, Example 3 is particularly suggestive of the utility
of a Gröbner basis. The idea is that an ordinary set of generators may have
the property that certain “small” elements of $I$ can be expanded in terms of the
generators only using “large” coefficients and that this property is reflected in the
failure of $(\mathrm{LT}(g_1),\dots,\mathrm{LT}(g_k))$ to exhaust $\mathrm{LT}(I)$.


EXAMPLES WITH LEXICOGRAPHIC ORDERING. 


(1) Principal ideal. If $I = (f(X_1,\dots,X_n))$, then $\{f\}$ is a Gröbner basis. In
fact, the most general member of $I$ is of the form $hf$ with $h$ in $K[X_1,\dots,X_n]$,
and Proposition 8.18 gives $\mathrm{LT}(hf) = \mathrm{LT}(h)\,\mathrm{LT}(f)$. Therefore $\mathrm{LT}(I) = (\mathrm{LT}(f))$,
as required.


(2) Ideal generated by members of $K[X_1,\dots,X_n]$. Suppose that $I =
(L_1,\dots,L_k)$, where each $L_j$ is a homogeneous linear polynomial of degree 1. For
example, $I$ could be $(X_1 + X_2 + X_3,\ X_1 - X_3)$. Let us form the corresponding
$k$-by-$n$ coefficient matrix, specifically $\begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & -1 \end{pmatrix}$ in the 3-variable example. If
we perform row operations to transform this matrix into reduced row-echelon
form and let $L'_1,\dots,L'_k$ be the members of $K[X_1,\dots,X_n]$ corresponding to
the reduced matrix, specifically $X_1 - X_3$ and $X_2 + 2X_3$ for the reduced form
$\begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 2 \end{pmatrix}$ of $\begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & -1 \end{pmatrix}$, then $I = (L'_1,\dots,L'_k)$ and moreover $\{L'_1,\dots,L'_k\}$ is a
Gröbner basis of $I$. This fact is not particularly obvious in the full generality of
this example, but it will be shown to be an easy consequence of Theorem 8.23 in
the next section.


(3) Earlier example in this section. In $K[X, Y]$, let $I = (f_1, f_2)$ with $f_1(X, Y)
= X^2 + 2XY^2$ and $f_2(X, Y) = XY + 2Y^3 - 1$. Then $(\mathrm{LT}(f_1), \mathrm{LT}(f_2)) =
(X^2, XY)$. We saw that $X$ is a member of $I$ and that $\mathrm{LT}(X) = X$ is not in
$(\mathrm{LT}(f_1), \mathrm{LT}(f_2))$. So $\{f_1, f_2\}$ is not a Gröbner basis. If we enlarge the set
of generators of $I$ to $\{f_1, f_2, X\}$, then we still do not have a Gröbner basis
because $f_2 - YX = 2Y^3 - 1$ is in $I$ and $\mathrm{LT}(f_2 - YX) = 2Y^3$ does not lie
in $(\mathrm{LT}(f_1), \mathrm{LT}(f_2), \mathrm{LT}(X)) = (X^2, XY, X) = (X)$. We can enlarge the set of
generators still further to $\{f_1, f_2, X, 2Y^3 - 1\}$. Is this a Gröbner basis? Here
we have $(\mathrm{LT}(f_1), \mathrm{LT}(f_2), \mathrm{LT}(X), \mathrm{LT}(2Y^3 - 1)) = (X, Y^3)$, and it seems as if this
equals $\mathrm{LT}(I)$. But we need a way of checking easily. We shall obtain a way of
checking in Theorem 8.23 in the next section.



The question of existence and uniqueness of a Gröbner basis will be addressed
constructively in Sections 8-9; however, we did observe at the beginning of this 
section that Hilbert’s proof of the Hilbert Basis Theorem essentially handles exis- 
tence when the monomial ordering is the usual lexicographic ordering. Actually, 
the argument at the beginning of the section had two parts to it—a nonconstructive 
argument producing a certain finite set of leading terms and a verification that 
those leading terms lead to a set of generators of the ideal. The first part, being 
a nonconstructive existence proof, does not help us in our current efforts, and 
we defer to Problem 13 at the end of the chapter the question of adapting it to 
a general monomial order. The second part, on the other hand, is a useful kind 
of verification in our current efforts. It shows that a certain kind of finite subset 
of an ideal is necessarily a set of generators, and it generalizes as follows. The 
generalization will play a role in Section 9. 


Proposition 8.19. If $K$ is a field, if a monomial ordering is specified for
$K[X_1,\dots,X_n]$, and if $\{g_1,\dots,g_k\}$ is a Gröbner basis for a nonzero ideal $I$ of
$K[X_1,\dots,X_n]$, then $\{g_1,\dots,g_k\}$ generates $I$.


PROOF. First we prove that if $f \neq 0$ is in $I$, then there exist a $g_j$, a monomial
$M_0$, and a nonzero scalar $c$ such that $\mathrm{LM}(f - cM_0 g_j) < \mathrm{LM}(f)$. To see this, we use
the hypothesis that $\{g_1,\dots,g_k\}$ is a Gröbner basis to find polynomials $h_1,\dots,h_k$
such that $\mathrm{LM}(f) = \sum_{i=1}^{k} h_i \mathrm{LM}(g_i)$. Then it must be true for $i$ equal to some
index $j$ that $\mathrm{LM}(f) = M_0 \mathrm{LM}(g_j)$ for one of the monomials $M_0$ that appears in
$h_j$ with nonzero coefficient. Since $M_0 \mathrm{LM}(g_j) = \mathrm{LM}(M_0)\mathrm{LM}(g_j) = \mathrm{LM}(M_0 g_j)$,
we can rewrite this equality as $\mathrm{LT}(f) = c\,\mathrm{LT}(M_0 g_j)$ for some scalar $c \neq 0$. Then
$\mathrm{LT}(f) = \mathrm{LT}(cM_0 g_j)$, and Proposition 8.18 shows that $\mathrm{LM}(f - cM_0 g_j) < \mathrm{LM}(f)$,
as asserted.

Iterating this construction and assuming that we never get 0, we can find
successively nonzero scalars $c_j$, monomials $M_j$, and members $g_{i_j}$ of the Gröbner
basis such that the sequence $\mathrm{LM}\big(f - \sum_{j=1}^{l} c_j M_j g_{i_j}\big)$ indexed by $l$ is strictly
decreasing, in contradiction to Proposition 8.16. To avoid the contradiction, we
must have $f - \sum_{j=1}^{l} c_j M_j g_{i_j} = 0$ for some $l$, and then $f$ is exhibited as in the
ideal $(g_1,\dots,g_k)$. Hence the Gröbner basis generates $I$.


8. Constructive Existence 


Throughout this section, $K$ denotes a field, and we work with a fixed monomial
ordering on $K[X_1,\dots,X_n]$. Ideals in $K[X_1,\dots,X_n]$ will always be specified by
giving finite sets of generators. Our objective is to obtain a constructive proof of
the existence of a Gröbner basis for each nonzero ideal in $K[X_1,\dots,X_n]$, along
with a useful test procedure for deciding whether a given finite set of generators of
$I$ is a Gröbner basis. As is often the case with existence proofs, the motivation for
the proof comes from a certain amount of deduction of properties that a Gröbner
basis must satisfy if it exists. It was mentioned in the previous section that the
failure of a set of generators to be a Gröbner basis has something to do with
its failure to be able to represent all “small” elements of the ideal by means of
expansions in terms of the generators that use “small” coefficients. The first part
of this section will explore this idea, seeking to make it precise. The main step
will be a checkable test for a set to be a Gröbner basis; this is Theorem 8.23.
The existence argument will be an easy corollary. A by-product of the existence
argument will be a way of testing a polynomial for membership in $I$.

In the one-variable case any ideal is principal, necessarily of the form $(g(X))$,
and the test for membership of a polynomial $f$ in the ideal is to apply the division
algorithm, writing $f(X) = q(X)g(X) + r(X)$ with $r = 0$ or $\deg r < \deg g$.
Then $f$ is a member of the ideal if and only if $r = 0$. The starting point for the
several-variable theory is to do the best we can to generalize the division algorithm 
to several variables, recognizing that we cannot expect too much because of the 
complicated ideal structure in several variables. 


Proposition 8.20 (generalized division algorithm). Let $(f_1,\dots,f_s)$ be a fixed
enumeration of a set of nonzero members of $K[X_1,\dots,X_n]$, and let $f$ be an
arbitrary nonzero member of $K[X_1,\dots,X_n]$. Then there exist polynomials
$a_1,\dots,a_s$ and $r$ such that

$$f = a_1 f_1 + \cdots + a_s f_s + r,$$

such that $\mathrm{LM}(a_j f_j) \leq \mathrm{LM}(f)$ for all $j$, and such that no monomial appearing in $r$
with nonzero coefficient is divisible by $\mathrm{LM}(f_j)$ for any $j$.


REMARK. The proof below will stop short of giving an algorithm, because
omitting the details of the algorithm will make the invariant of the construction
clearer. To make the proof into an algorithm, one merely needs to be systematic
about the choices in the proof. There is no claim of any uniqueness of $a_1,\dots,a_s$
or $r$ in the statement; in fact, Problem 16 at the end of the chapter shows that
more than one kind of nonuniqueness is possible. Corollary 8.21 below, however,
will show that if the given $f_1,\dots,f_s$ form a Gröbner basis of an ideal $I$, then
$r$ is independent of the enumeration of the Gröbner basis, even without the
requirement that $\mathrm{LM}(a_j f_j) \leq \mathrm{LM}(f)$ for all $j$.


PROOF. We shall do a kind of induction involving decompositions of $f$ of the
form

$$f = (a_1 f_1 + \cdots + a_s f_s) + p + r, \qquad (*)$$

where $a_1,\dots,a_s$, $p$, $r$ are polynomials with the properties that
(i) $\mathrm{LM}(p) \leq \mathrm{LM}(f)$,
(ii) $\mathrm{LM}(a_i f_i) \leq \mathrm{LM}(f)$ for all $i$,
(iii) no monomial $M$ appearing in $r$ with nonzero coefficient has $M$ divisible
by any $\mathrm{LM}(f_i)$,

and we shall demonstrate that $\mathrm{LM}(p)$ decreases at every step of the induction as
long as $p \neq 0$. Initially we take all $a_i = 0$, $p = f$, and $r = 0$. Then $(*)$ and the
three properties hold at the start. Let us describe the inductive step.

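If $\mathrm{LT}(f_j)$ divides $\mathrm{LT}(p)$ for some $j$, then we replace $a_j$ by $a_j + \mathrm{LT}(p)/\mathrm{LT}(f_j)$,
we change $p$ to $p - (\mathrm{LT}(p)/\mathrm{LT}(f_j)) f_j$, and we leave $r$ alone. The equality $(*)$
is maintained, and (iii) continues to hold. Since

$$\mathrm{LT}\big((\mathrm{LT}(p)/\mathrm{LT}(f_j)) f_j\big) = \mathrm{LT}\big(\mathrm{LT}(p)/\mathrm{LT}(f_j)\big)\,\mathrm{LT}(f_j) = (\mathrm{LT}(p)/\mathrm{LT}(f_j))\,\mathrm{LT}(f_j) = \mathrm{LT}(p), \qquad (**)$$

Proposition 8.18 shows that $\mathrm{LM}(p)$ strictly decreases. Consequently (i) continues
to hold. By the same kind of computation as for $(**)$,

$$\mathrm{LM}\big((a_j + \mathrm{LT}(p)/\mathrm{LT}(f_j)) f_j\big) \leq \max\big(\mathrm{LM}(a_j f_j),\ \mathrm{LM}((\mathrm{LT}(p)/\mathrm{LT}(f_j)) f_j)\big) \leq \max(\mathrm{LM}(f), \mathrm{LM}(p)) = \mathrm{LM}(f),$$

and therefore (ii) continues to hold. This completes the inductive step if $\mathrm{LT}(f_j)$
divides $\mathrm{LT}(p)$ for some $j$.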

The contrary case is that $\mathrm{LT}(p)$ is divisible by $\mathrm{LT}(f_i)$ for no $i$. Then we replace
$p$ by $p - \mathrm{LT}(p)$, we change $r$ to $r + \mathrm{LT}(p)$, and we leave all $a_i$ alone. The
equality $(*)$ is maintained, and (ii) continues to hold. Since $\mathrm{LM}(p) = \mathrm{LM}(\mathrm{LT}(p))$,
Proposition 8.18 shows that $\mathrm{LM}(p)$ strictly decreases. Consequently (i) continues
to hold. Also, (iii) continues to hold because of the assumption that $\mathrm{LT}(p)$ is
divisible by $\mathrm{LT}(f_i)$ for no $i$. This completes the inductive step if $\mathrm{LT}(p)$ is divisible
by $\mathrm{LT}(f_i)$ for no $i$.

Proposition 8.16 shows that the induction can continue for only finitely many
steps. Since it must continue as long as $p \neq 0$, the conclusion is that $p = 0$ after
some stage, and then the decomposition of the proposition has been proved.
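
The proof can be turned into a procedure by fixing the choices, for instance by always using the first $f_j$ in the enumeration whose leading term divides $\mathrm{LT}(p)$. The Python sketch below does this for the lexicographic ordering. The dictionary representation (exponent tuples mapped to `Fraction` coefficients) and the names `divide`, `add_term`, and `divides` are assumptions of this example rather than anything in the text.

```python
from fractions import Fraction

def divides(m, n):
    """True if the monomial with exponents m divides the one with exponents n."""
    return all(a <= b for a, b in zip(m, n))

def add_term(poly, mono, coeff):
    """In place: poly += coeff * X^mono, dropping a coefficient that becomes 0."""
    c = poly.get(mono, Fraction(0)) + coeff
    if c:
        poly[mono] = c
    else:
        poly.pop(mono, None)

def divide(f, fs):
    """Return ([a_1, ..., a_s], r) with f = a_1 f_1 + ... + a_s f_s + r."""
    p = dict(f)                       # the part still to be processed
    quotients = [{} for _ in fs]      # a_1, ..., a_s
    r = {}                            # remainder
    while p:
        m = max(p)                    # LM(p) in lex order
        c = p[m]                      # LC(p)
        for a, fj in zip(quotients, fs):
            mj = max(fj)              # LM(f_j)
            if divides(mj, m):
                q_mono = tuple(x - y for x, y in zip(m, mj))
                q_coeff = c / fj[mj]  # LT(p)/LT(f_j) = q_coeff * X^q_mono
                add_term(a, q_mono, q_coeff)
                for mono, coeff in fj.items():
                    shifted = tuple(x + y for x, y in zip(q_mono, mono))
                    add_term(p, shifted, -q_coeff * coeff)
                break
        else:                         # no LT(f_j) divides LT(p)
            add_term(r, m, c)
            del p[m]
    return quotients, r

F = Fraction
f1 = {(2, 0): F(1), (1, 2): F(2)}                 # X^2 + 2XY^2
f2 = {(1, 1): F(1), (0, 3): F(2), (0, 0): F(-1)}  # XY + 2Y^3 - 1
f3 = {(1, 0): F(1)}                               # X

# Dividing f2 by (X) reproduces f2 - YX = 2Y^3 - 1 from Example 3.
quotients, r = divide(f2, [f3])
assert quotients == [{(0, 1): 1}] and r == {(0, 3): 2, (0, 0): -1}
```

The remainder can depend on the enumeration when the divisors are not a Gröbner basis: with these polynomials, `divide({(1, 2): F(2)}, [f1, f2, f3])` leaves the remainder $-4Y^4 + 2Y$, while `divide({(1, 2): F(2)}, [f1, f3, f2])` leaves remainder 0.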


Corollary 8.21. If $\{g_1,\dots,g_s\}$ is a Gröbner basis of a nonzero ideal $I$ of
$K[X_1,\dots,X_n]$ and if $f$ is any nonzero member of $K[X_1,\dots,X_n]$, then there
exist polynomials $g$ and $r$ such that $f = g + r$, $g$ is in $I$, and no monomial appearing
in $r$ with nonzero coefficient is divisible by $\mathrm{LM}(g_j)$ for any $j$. Moreover, $r$ is
uniquely determined by these properties, and $g$ has an expansion $g = \sum_{i=1}^{s} a_i g_i$
with $\mathrm{LM}(a_i g_i) \leq \mathrm{LM}(f)$ for all $i$.


REMARKS. The uniqueness statement implies in particular that $r$ is independent
of the enumeration of the set $\{g_1,\dots,g_s\}$. This corollary will give us some insight
into the way a Gröbner basis can resolve cancellation. Shortly we shall introduce
specific members of $I$ that have cancellation built into their definition. Being in
$I$, they have expansions with remainder term 0, according to this corollary. Since
the remainder is unique, the corollary says that they can be rewritten in terms of
the Gröbner basis in a way that eliminates the cancellation.


PROOF. For existence, let $\{g_1,\dots,g_s\}$ be a Gröbner basis of $I$, and apply
Proposition 8.20 to $f$ and the ordered set $(g_1,\dots,g_s)$. Then the existence follows
immediately.


For uniqueness, suppose that $f = g_1 + r_1 = g_2 + r_2$. Then $r_1 - r_2 = g_2 - g_1$
exhibits $r_1 - r_2$ as in $I$. Arguing by contradiction, suppose that $r_1 \neq r_2$. The
hypothesis on $r_1$ and $r_2$ shows that no monomial with nonzero coefficient in
$r_1 - r_2$ is divisible by any $\mathrm{LM}(g_j)$, and in particular $\mathrm{LM}(r_1 - r_2)$ is not divisible
by any of the generators of the monomial ideal $(\mathrm{LM}(g_1),\dots,\mathrm{LM}(g_s)) = \mathrm{LT}(I)$.
Since $\mathrm{LM}(r_1 - r_2)$ is a monomial in this ideal, this conclusion contradicts the last
conclusion of Lemma 8.17.


Suppose that $X^\alpha = X_1^{\alpha_1} \cdots X_n^{\alpha_n}$ and $X^\beta = X_1^{\beta_1} \cdots X_n^{\beta_n}$ are two monomials in
$K[X_1,\dots,X_n]$. Then we define their least common multiple $\mathrm{LCM}(X^\alpha, X^\beta)$ to
be

$$\mathrm{LCM}(X^\alpha, X^\beta) = X^\gamma = X_1^{\gamma_1} \cdots X_n^{\gamma_n} \quad\text{with } \gamma_j = \max(\alpha_j, \beta_j) \text{ for all } j.$$

This notion does not depend on the choice of a monomial ordering. Observe
for any two monomials $M$ and $N$ that $\mathrm{LCM}(M, N)/M$ and $\mathrm{LCM}(M, N)/N$ are
monomials.
If $f_1$ and $f_2$ are nonzero polynomials, then the expression

$$\frac{\mathrm{LCM}(\mathrm{LM}(f_1), \mathrm{LM}(f_2))}{\mathrm{LT}(f_1)}\, f_1 = \frac{\mathrm{LCM}(\mathrm{LM}(f_1), \mathrm{LM}(f_2))}{\mathrm{LM}(f_1)} \cdot \frac{f_1}{\mathrm{LC}(f_1)}$$

is a polynomial whose leading monomial is $\mathrm{LCM}(\mathrm{LM}(f_1), \mathrm{LM}(f_2))$ and whose
leading coefficient is 1. We define the S-polynomial of $f_1$ and $f_2$ to be

$$S(f_1, f_2) = \frac{\mathrm{LCM}(\mathrm{LM}(f_1), \mathrm{LM}(f_2))}{\mathrm{LT}(f_1)}\, f_1 - \frac{\mathrm{LCM}(\mathrm{LM}(f_1), \mathrm{LM}(f_2))}{\mathrm{LT}(f_2)}\, f_2.$$

This is the difference of two polynomials with the same leading monomial
$\mathrm{LCM}(\mathrm{LM}(f_1), \mathrm{LM}(f_2))$ and with the same leading coefficient 1. Accordingly,
Proposition 8.18 shows that

$$\mathrm{LM}(S(f_1, f_2)) < \mathrm{LCM}(\mathrm{LM}(f_1), \mathrm{LM}(f_2)).$$

The elements $S(f_1, f_2)$ are the elements mentioned in the remarks with Corollary
8.21; the above inequality is a precise formulation of their built-in cancellation.
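
To see the built-in cancellation concretely, here is a small Python sketch of the S-polynomial for the lexicographic ordering. The dictionary representation (exponent tuples mapped to `Fraction` coefficients) and the name `s_polynomial` are assumptions of this example. It is checked against $f_1 = X^2 + 2XY^2$ and $f_2 = XY + 2Y^3 - 1$ from Example 3 of the previous section.

```python
from fractions import Fraction

def s_polynomial(f1, f2):
    """S(f1, f2) = (L/LT(f1)) f1 - (L/LT(f2)) f2 with L = LCM(LM(f1), LM(f2))."""
    m1, m2 = max(f1), max(f2)                 # leading monomials (lex order)
    lcm = tuple(max(a, b) for a, b in zip(m1, m2))
    result = {}
    for f, m, sign in ((f1, m1, 1), (f2, m2, -1)):
        shift = tuple(a - b for a, b in zip(lcm, m))   # exponents of L/LM(f)
        scale = Fraction(sign) / f[m]                  # +-1/LC(f)
        for mono, coeff in f.items():
            key = tuple(a + b for a, b in zip(shift, mono))
            c = result.get(key, Fraction(0)) + scale * coeff
            if c:
                result[key] = c
            else:
                result.pop(key, None)                  # the leading terms cancel here
    return result

F = Fraction
f1 = {(2, 0): F(1), (1, 2): F(2)}                 # X^2 + 2XY^2
f2 = {(1, 1): F(1), (0, 3): F(2), (0, 0): F(-1)}  # XY + 2Y^3 - 1

# Y f1 - X f2 = X, whose leading monomial X is smaller than LCM(X^2, XY) = X^2 Y.
assert s_polynomial(f1, f2) == {(1, 0): 1}
```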

Lemma 8.22 below says that whenever cancellation of this kind occurs in
any sum of products with functions $f_1,\dots,f_s$, then the sum of products can be
rewritten in terms of the S-polynomials $S(f_j, f_k)$. In this way the nature of the
cancellation has been made more transparent, partly being accounted for by the
definitions of the individual polynomials $S(f_j, f_k)$.


Lemma 8.22. Let $M$ and $M_1,\dots,M_s$ be monomials, let $f_1,\dots,f_s$ be nonzero
polynomials, and suppose that $M_i \mathrm{LM}(f_i) = M$ for all $i$. If $c_1,\dots,c_s$ are constants
such that $\mathrm{LM}\big(\sum_{i=1}^{s} c_i M_i f_i\big) < M$, then the sum $\sum_{i=1}^{s} c_i M_i f_i$ can be rewritten
in the form

$$\sum_{i=1}^{s} c_i M_i f_i = \sum_{j<k} d_{jk}\, \frac{M}{\mathrm{LCM}(\mathrm{LM}(f_j), \mathrm{LM}(f_k))}\, S(f_j, f_k)$$

for suitable constants $d_{jk}$. In the sum on the right side, each nonzero term has
leading monomial $< M$.


PROOF. Let us write $L_{ij} = \mathrm{LCM}(\mathrm{LM}(f_i), \mathrm{LM}(f_j))$ for $i \neq j$. We may assume
that all the $c_i$ are nonzero, and we proceed by induction on $s$. There is nothing to
prove for $s = 1$. The key step is $s = 2$, for which we are given that the $M$ term
of $c_1 M_1 f_1 + c_2 M_2 f_2$ is 0, i.e., that

$$c_1 \mathrm{LC}(f_1) + c_2 \mathrm{LC}(f_2) = 0. \qquad (*)$$

Substituting for $\mathrm{LC}(f_2)$ from $(*)$ gives

$$M L_{12}^{-1} S(f_1, f_2) = M f_1/\mathrm{LT}(f_1) - M f_2/\mathrm{LT}(f_2)$$
$$= M_1 f_1/\mathrm{LC}(f_1) - M_2 f_2/\mathrm{LC}(f_2)$$
$$= c_1^{-1} \mathrm{LC}(f_1)^{-1} (c_1 M_1 f_1 + c_2 M_2 f_2),$$

and this proves the displayed formula of the lemma with $d_{12} = c_1 \mathrm{LC}(f_1)$.


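Assume the result for $s - 1 \geq 2$. We are given that $\sum_{i=1}^{s} c_i \mathrm{LC}(f_i) = 0$, which
we break into two parts as

$$c_1 \mathrm{LC}(f_1) - \frac{c_1 \mathrm{LC}(f_1)}{\mathrm{LC}(f_2)}\, \mathrm{LC}(f_2) = 0,$$

$$\Big(c_2 + \frac{c_1 \mathrm{LC}(f_1)}{\mathrm{LC}(f_2)}\Big) \mathrm{LC}(f_2) + \sum_{i=3}^{s} c_i \mathrm{LC}(f_i) = 0.$$

The inductive hypothesis gives

$$c_1 M_1 f_1 - \frac{c_1 \mathrm{LC}(f_1)}{\mathrm{LC}(f_2)}\, M_2 f_2 = d_{12} M L_{12}^{-1} S(f_1, f_2),$$

$$\Big(c_2 + \frac{c_1 \mathrm{LC}(f_1)}{\mathrm{LC}(f_2)}\Big) M_2 f_2 + \sum_{i=3}^{s} c_i M_i f_i = \sum_{2 \leq j < k} d_{jk} M L_{jk}^{-1} S(f_j, f_k).$$

Adding these two formulas, we obtain the displayed formula of the lemma for
the case $s$, and the induction is complete.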


Theorem 8.23. Let $\{g_1,\dots,g_s\}$ be a set of generators of a nonzero ideal $I$ of
$K[X_1,\dots,X_n]$, and assume that $g_i \neq 0$ for all $i$. Then the following conditions
on $\{g_1,\dots,g_s\}$ are equivalent:

(a) $\{g_1,\dots,g_s\}$ is a Gröbner basis of $I$,
(b) for each pair $(g_j, g_k)$ with $S(g_j, g_k) \neq 0$, every expansion of $S(g_j, g_k)$ as
$S(g_j, g_k) = \sum_{i=1}^{s} a_{ijk} g_i + r$ with the two properties that
(i) $\mathrm{LM}(a_{ijk} g_i) \leq \mathrm{LM}(S(g_j, g_k))$ and
(ii) no monomial appearing in $r$ with nonzero coefficient is divisible
by $\mathrm{LM}(g_i)$ for any $i$
has $r = 0$,
(c) for each pair $(g_j, g_k)$ with $S(g_j, g_k) \neq 0$, there is an expansion of the
form $S(g_j, g_k) = \sum_{i=1}^{s} a_{ijk} g_i$ with $\mathrm{LM}(a_{ijk} g_i) \leq \mathrm{LM}(S(g_j, g_k))$.


REMARKS. Because of the equivalence of (b) and (c), the generalized division
algorithm (Proposition 8.20) gives us a procedure for testing whether these
conditions are satisfied by $\{g_1,\dots,g_s\}$. Namely we follow through the steps in
the proof of Proposition 8.20 in whatever fashion we please for each nonzero
$S(g_j, g_k)$. If we get remainder $r = 0$ for each pair $(j, k)$, then the conditions are
satisfied. If we get a nonzero remainder $r$ for some pair, then the conditions are
not satisfied. In view of the equivalence of (a) with these conditions, we have an
effective (though somewhat tedious) way of checking whether $\{g_1,\dots,g_s\}$ is a
Gröbner basis.


PROOF. We prove that (a) implies (b) and that (c) implies (a). Since (b)
certainly implies (c), the proof will be complete.

Let (a) hold, i.e., let $\{g_1,\dots,g_s\}$ be a Gröbner basis. If $S(g_j, g_k) \neq 0$, then
$S(g_j, g_k)$ is a nonzero member of $I$ because each $g_i$ lies in $I$, and $S(g_j, g_k)$
consequently has an expansion as $\sum_{i=1}^{s} a_i g_i + r$ with $r = 0$. By Corollary 8.21 it
has a possibly different expansion with $r = 0$ and with $\mathrm{LM}(a_i g_i) \leq \mathrm{LM}(S(g_j, g_k))$
for each $i$. On the other hand, in any expansion of $S(g_j, g_k)$ as $\sum_{i=1}^{s} a_i g_i + r$
such that (ii) holds, whether or not $\mathrm{LM}(a_i g_i) \leq \mathrm{LM}(S(g_j, g_k))$, $r$ must be 0 by
Corollary 8.21. This proves (b).

To prove that (c) implies (a), we argue by contradiction. Among all expansions
of members of $I$ as $\sum_{i=1}^{s} b_i g_i$ such that $\mathrm{LT}\big(\sum_{i=1}^{s} b_i g_i\big)$ is not in the ideal
$(\mathrm{LT}(g_1),\dots,\mathrm{LT}(g_s))$, choose one for which

$$M = \max_{1 \leq i \leq s} \mathrm{LM}(b_i g_i)$$

is as small as possible; this choice exists by Proposition 8.16. For this choice, let

$$f = \sum_{i=1}^{s} b_i g_i. \qquad (*)$$

Define $M_i = \mathrm{LM}(b_i)$ for each $i$ with $b_i \neq 0$. If $i_0$ is an index with $M =
\mathrm{LM}(b_{i_0} g_{i_0})$, then $M = M_{i_0} \mathrm{LM}(g_{i_0})$ by Proposition 8.18, and hence $M$ lies in
$(\mathrm{LT}(g_1),\dots,\mathrm{LT}(g_s))$. Since $\mathrm{LT}\big(\sum_{i=1}^{s} b_i g_i\big)$ is not in $(\mathrm{LT}(g_1),\dots,\mathrm{LT}(g_s))$, it
follows that $\mathrm{LM}\big(\sum_{i=1}^{s} b_i g_i\big) < M$. Within the set $\{1,\dots,s\}$, define a subset $E$ to
consist of those $i$ for which $M_i \mathrm{LM}(g_i) = M$. This set contains $i_0$, and it has the
property that all $i$ not in $E$ have $\mathrm{LM}(b_i g_i) < M$. We regroup $f$ as


$$f = \sum_{i \in E} b_i g_i + \sum_{i \notin E} b_i g_i = \sum_{i \in E} \mathrm{LC}(b_i) M_i g_i + \sum_{i \in E} (b_i - \mathrm{LT}(b_i)) g_i + \sum_{i \notin E} b_i g_i.$$

Every term in the second and third sums on the right side has leading monomial
$< M$, and so does $f$. Therefore $\mathrm{LM}\big(\sum_{i \in E} \mathrm{LC}(b_i) M_i g_i\big) < M$. It follows that
the expression $\sum_{i \in E} \mathrm{LC}(b_i) M_i g_i$ is of the form considered in Lemma 8.22 with
$c_i = \mathrm{LC}(b_i)$ for $i \in E$ (and $c_i = 0$ for $i \notin E$). The lemma tells us that

$$\sum_{i \in E} \mathrm{LC}(b_i) M_i g_i = \sum_{j<k} d_{jk} (M/L_{jk})\, S(g_j, g_k)$$

for suitable scalars $d_{jk}$, where $L_{jk} = \mathrm{LCM}(\mathrm{LM}(g_j), \mathrm{LM}(g_k))$.
Now we apply the hypothesis (c), expanding each $S(g_j, g_k)$ in some way as
$S(g_j, g_k) = \sum_{i=1}^{s} a_{ijk} g_i$ with the $a_{ijk}$ equal to polynomials such that

$$\mathrm{LM}(a_{ijk} g_i) \leq \mathrm{LM}(S(g_j, g_k)). \qquad (**)$$

Substituting for $S(g_j, g_k)$, we obtain

$$f = \sum_{i,j,k} d_{jk} (M/L_{jk}) a_{ijk} g_i + \sum_{i \in E} (b_i - \mathrm{LT}(b_i)) g_i + \sum_{i \notin E} b_i g_i. \qquad (\dagger)$$

We know that every term in the second and third sums on the right side of $(\dagger)$
has leading monomial $< M$, and we shall estimate the leading monomial of each
term in the first sum. Multiplying the inequality

$$\mathrm{LM}(S(g_j, g_k)) < \mathrm{LCM}(\mathrm{LM}(g_j), \mathrm{LM}(g_k)) = L_{jk}$$

by the monomial $M/L_{jk}$ yields

$$(M/L_{jk})\, \mathrm{LM}(S(g_j, g_k)) < M \qquad (\dagger\dagger)$$

for every pair $(j, k)$. Combining $(**)$ and $(\dagger\dagger)$ gives

$$\mathrm{LM}\big((M/L_{jk}) a_{ijk} g_i\big) = (M/L_{jk})\, \mathrm{LM}(a_{ijk} g_i) \leq (M/L_{jk})\, \mathrm{LM}(S(g_j, g_k)) < M.$$

Since each $d_{jk}$ is a scalar, every term in the first sum on the right side of $(\dagger)$
has leading monomial $< M$. Thus $(\dagger)$ is an expansion of a member of $I$ that
contradicts the minimality of $\max_i \mathrm{LM}(b_i g_i)$ in the expansion $(*)$. From this
contradiction we conclude that (a) holds.


EXAMPLE OF A VERIFICATION THAT A SET IS A GRÖBNER BASIS. This example
continues Example 2 of “Examples with lexicographic ordering” in the previous
section. A nonzero ideal $I$ is generated by members of $K[X_1,\dots,X_n]$ and is of the
form $(L_1,\dots,L_s)$, where each $L_j$ is a linear combination of $X_1,\dots,X_n$. After
initial manipulations we assume that the matrix of coefficients of $L_1,\dots,L_s$ is in
reduced row-echelon form. The assertion is that $\{L_1,\dots,L_s\}$ is then a Gröbner
basis of $I$. To prove this, we write $L_i = X_{n_i} + l_i$, where $X_{n_i}$ is the associated
corner variable and $l_i$ is a linear combination of $X_{n_i+1},\dots,X_n$ such that the
coefficient of each corner variable is 0. If $j < k$, then

$$S(L_j, L_k) = X_{n_k} L_j - X_{n_j} L_k = -l_k (X_{n_j} + l_j) + l_j (X_{n_k} + l_k) = -l_k L_j + l_j L_k.$$

The second term on the right side contains no variable $X_1,\dots,X_{n_j}$, but the first
term on the right side contains $X_{n_j}$. Therefore, relative to the lexicographic
ordering, we have $\mathrm{LM}(S(L_j, L_k)) = \mathrm{LM}(-l_k L_j) = \mathrm{LM}(l_k) X_{n_j}$. Consequently
$\mathrm{LM}(l_j L_k) \leq \mathrm{LM}(S(L_j, L_k))$ (and actually strict inequality must hold). Thus the
displayed formula shows that $S(L_j, L_k) = a_1 L_j + a_2 L_k$ in the form demanded
by (c) of Theorem 8.23. Since (c) implies (a) in the theorem, $\{L_1,\dots,L_s\}$ is a
Gröbner basis of $I$.

Corollary 8.24 (Buchberger’s algorithm).¹⁵ Each nonzero ideal in the polynomial
ring $K[X_1,\dots,X_n]$ has a Gröbner basis. Such a basis can be obtained by
the following procedure: Start from any set $\{f_1,\dots,f_s\}$ of nonzero generators,
apply the generalized division algorithm in some fashion to each $S(f_j, f_k)$ and
to the generating set $\{f_1,\dots,f_s\}$, and adjoin to the set of generators any nonzero
remainders obtained from this process. Iterate this process for enlarging a set
$\{f_1,\dots,f_{s'}\}$ of generators as long as a nonzero remainder is obtained for some
$S(f_j, f_k)$. This process must terminate at some point with all remainders equal
to 0, and the resulting generating set is a Gröbner basis.


PROOF. At the stage of the iteration that works with the set $\{f_1,\dots,f_{s'}\}$ of
generators, any nonzero remainder $r$ that arises has the property that no monomial
occurring in $r$ is divisible by any $\mathrm{LM}(f_j)$. By Lemma 8.17, $\mathrm{LT}(r)$ is not a member
of $(\mathrm{LT}(f_1),\dots,\mathrm{LT}(f_{s'}))$. However, at the next stage when $r$ has been designated
as one of the generators of $I$, $\mathrm{LT}(r)$ has become one of the generators of this
ideal. Therefore the ideal $(\mathrm{LT}(f_1),\dots,\mathrm{LT}(f_{s'}))$ strictly increases as we pass from
one stage to the next. Since $K[X_1,\dots,X_n]$ is Noetherian, its ideals satisfy the
ascending chain condition, and this chain of ideals must stabilize. Consequently
all the remainders must be 0 at some point, and then Theorem 8.23 shows that
the set of generators is a Gröbner basis.


EXAMPLE OF THE COMPUTATION OF A GRÖBNER BASIS. We return to Example
3 of “Examples with lexicographic ordering” in the previous section. In $K[X, Y]$,
we let $f_1(X, Y) = X^2 + 2XY^2$ and $f_2(X, Y) = XY + 2Y^3 - 1$, and we define
$I = (f_1, f_2)$. We seek a Gröbner basis of $I$, using the lexicographic ordering.
Direct computation gives $S(f_1, f_2) = Y(X^2 + 2XY^2) - X(XY + 2Y^3 - 1) = X$.
Since $X$ is not divisible by $\mathrm{LM}(f_1)$ or by $\mathrm{LM}(f_2)$, $S(f_1, f_2) = 0 f_1 + 0 f_2 + X$
is an expansion of $S(f_1, f_2)$ as in Theorem 8.23b with $r = X$. The procedure
of Corollary 8.24 says to adjoin $f_3 = X$ to the generating set and test again.
Direct computation gives $S(f_1, f_3) = 1(X^2 + 2XY^2) - X \cdot X = 2XY^2$, and
$S(f_1, f_3) = 0 f_1 + 0 f_2 + (2Y^2) f_3 + 0$ is an expansion of $S(f_1, f_3)$ as in (c),
since $\mathrm{LM}(2Y^2 f_3) \leq \mathrm{LM}(S(f_1, f_3))$. Thus $S(f_1, f_3)$ gives us a 0 remainder, hence
nothing new to process. In addition, we have $S(f_2, f_3) = 1(XY + 2Y^3 - 1) -
Y \cdot X = 2Y^3 - 1$. No term of this is divisible by any of the leading monomials of
$f_1, f_2, f_3$, namely $X^2$, $XY$, $X$. Hence $2Y^3 - 1$ is a nonzero remainder.¹⁶ Therefore
we are to adjoin $f_4 = 2Y^3 - 1$ to our set. Computation gives $S(f_1, f_4) =
2XY^5 + \tfrac12 X^2 = (2Y^5 + \tfrac12 X) f_3$, $S(f_2, f_4) = 2Y^5 - Y^2 + \tfrac12 X = \tfrac12 f_3 + Y^2 f_4$,
and $S(f_3, f_4) = \tfrac12 X = \tfrac12 f_3$. In every case each term has leading monomial at
most the leading monomial of the S-polynomial. Hence all remainders are 0, and
Corollary 8.24 says that $\{f_1, f_2, f_3, f_4\}$ is a Gröbner basis of $I$.

¹⁵ Computer programs typically use an improved version of this algorithm to compute Gröbner
bases.

¹⁶ It was not a bad choice of decomposition that led to a nonzero remainder when some other
decomposition might have given us 0; the equivalence of (b) and (c) in Theorem 8.23 assures us of
that fact.
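
The computation in this example can be replayed mechanically. The Python sketch below is one possible rendering of the procedure of Corollary 8.24 for the lexicographic ordering, not the improved algorithm that computer algebra systems use. The dictionary representation and the names `add_scaled`, `remainder`, `s_polynomial`, and `buchberger` are assumptions of this sketch. The division step tries the most recently adjoined generators first, a choice made here only so that the run makes the same choices as the example.

```python
from fractions import Fraction
from itertools import combinations

def add_scaled(target, poly, shift, scale):
    """In place: target += scale * X^shift * poly, dropping zero coefficients."""
    for mono, coeff in poly.items():
        key = tuple(a + b for a, b in zip(shift, mono))
        c = target.get(key, Fraction(0)) + scale * coeff
        if c:
            target[key] = c
        else:
            target.pop(key, None)

def remainder(f, gens):
    """Remainder r of the generalized division algorithm (lex ordering)."""
    p, r = dict(f), {}
    while p:
        m = max(p)                                   # LM(p)
        for g in reversed(gens):                     # newest generators first
            mg = max(g)                              # LM(g)
            if all(a <= b for a, b in zip(mg, m)):   # LM(g) divides LM(p)
                shift = tuple(b - a for a, b in zip(mg, m))
                add_scaled(p, g, shift, -p[m] / g[mg])
                break
        else:
            r[m] = p.pop(m)                          # move LT(p) into r
    return r

def s_polynomial(f, g):
    mf, mg = max(f), max(g)
    lcm = tuple(max(a, b) for a, b in zip(mf, mg))
    s = {}
    add_scaled(s, f, tuple(a - b for a, b in zip(lcm, mf)), 1 / f[mf])
    add_scaled(s, g, tuple(a - b for a, b in zip(lcm, mg)), -1 / g[mg])
    return s

def buchberger(gens):
    """Adjoin nonzero remainders of S-polynomials until none are left."""
    basis = [dict(g) for g in gens]
    pairs = list(combinations(range(len(basis)), 2))
    while pairs:
        j, k = pairs.pop(0)
        r = remainder(s_polynomial(basis[j], basis[k]), basis)
        if r:
            pairs.extend((i, len(basis)) for i in range(len(basis)))
            basis.append(r)
    return basis

F = Fraction
f1 = {(2, 0): F(1), (1, 2): F(2)}                 # X^2 + 2XY^2
f2 = {(1, 1): F(1), (0, 3): F(2), (0, 0): F(-1)}  # XY + 2Y^3 - 1
G = buchberger([f1, f2])
# The run adjoins f3 = X and then f4 = 2Y^3 - 1, as in the example.
assert G == [f1, f2, {(1, 0): 1}, {(0, 3): 2, (0, 0): -1}]
```

With a different order of choices in the division step the procedure can adjoin other remainders and return a larger generating set; any such output is still a Gröbner basis of the same ideal.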


Corollary 8.25 (solution of the ideal-membership problem). If $I$ is a nonzero
ideal in $K[X_1,\dots,X_n]$ and $f$ is a polynomial, then a procedure for deciding
whether $f$ lies in $I$ is as follows: introduce a monomial ordering, construct
a Gröbner basis $\{g_1,\dots,g_s\}$ of $I$ by means of Corollary 8.24, and apply the
generalized division algorithm to write $f = \sum_{i=1}^{s} a_i g_i + r$ for polynomials
$a_1,\dots,a_s, r$ such that no monomial appearing in $r$ with nonzero coefficient is
divisible by $\mathrm{LM}(g_j)$ for any $j$. Then $f$ lies in $I$ if and only if $r = 0$.


PROOF. Corollary 8.24 produces the Gröbner basis, and Corollary 8.21 affirms
that this procedure decides whether $f$ lies in $I$.
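
As a small illustration of this membership test, the following Python sketch divides by the Gröbner basis $\{f_1, f_2, f_3, f_4\}$ found in the example following Corollary 8.24 and checks whether the remainder is 0. The dictionary representation and the names `remainder` and `in_ideal` are assumptions of this sketch, and it treats only the lexicographic ordering.

```python
from fractions import Fraction

def remainder(f, basis):
    """Remainder of f under the generalized division algorithm (lex ordering)."""
    p, r = dict(f), {}
    while p:
        m = max(p)                                    # LM(p)
        for g in basis:
            mg = max(g)                               # LM(g)
            if all(a <= b for a, b in zip(mg, m)):    # LM(g) divides LM(p)
                shift = tuple(b - a for a, b in zip(mg, m))
                scale = p[m] / g[mg]
                for mono, coeff in g.items():         # p -= scale * X^shift * g
                    key = tuple(a + b for a, b in zip(shift, mono))
                    c = p.get(key, Fraction(0)) - scale * coeff
                    if c:
                        p[key] = c
                    else:
                        p.pop(key, None)
                break
        else:
            r[m] = p.pop(m)                           # LT(p) goes to the remainder
    return r

def in_ideal(f, groebner_basis):
    """Membership test; valid only if groebner_basis really is a Groebner basis."""
    return not remainder(f, groebner_basis)

F = Fraction
G = [
    {(2, 0): F(1), (1, 2): F(2)},                 # f1 = X^2 + 2XY^2
    {(1, 1): F(1), (0, 3): F(2), (0, 0): F(-1)},  # f2 = XY + 2Y^3 - 1
    {(1, 0): F(1)},                               # f3 = X
    {(0, 3): F(2), (0, 0): F(-1)},                # f4 = 2Y^3 - 1
]
assert in_ideal({(1, 3): F(1)}, G)        # XY^3 lies in I
assert not in_ideal({(0, 3): F(1)}, G)    # Y^3 does not: its remainder is 1/2
```

No member of `G` is a nonzero scalar, so by the criterion of Corollary 8.26 this ideal is proper.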


Corollary 8.26 (solution of the proper-ideal problem). If $I$ is a nonzero ideal
in $K[X_1,\dots,X_n]$, then a procedure for deciding whether $I = K[X_1,\dots,X_n]$
is to compute a Gröbner basis for $I$ and to see whether one of its members is a
nonzero scalar $c$.


PROOF. If $I$ has a nonzero scalar as one of its generators, then 1 lies in $I$,
and hence $I$ certainly equals $K[X_1,\dots,X_n]$. Conversely if $I = K[X_1,\dots,X_n]$ is given, then
Corollary 8.24 produces a Gröbner basis $\{g_1,\dots,g_s\}$. Since $\mathrm{LT}(1) = 1$ and since
$\mathrm{LT}(I) = (\mathrm{LT}(g_1),\dots,\mathrm{LT}(g_s))$, the monomial 1 must lie in $(\mathrm{LT}(g_1),\dots,\mathrm{LT}(g_s))$.
Since 1 is a monomial, Lemma 8.17 shows that it must be divisible by $\mathrm{LM}(g_j)$
for some $j$. Therefore $\mathrm{LM}(g_j) = 1$. Since 1 is the smallest monomial in any
monomial ordering, it is the only monomial appearing with a nonzero coefficient
in $g_j$. Therefore $g_j$ is a nonzero scalar.


In many applications of Gröbner bases, there is some flexibility in what monomial
ordering to impose in obtaining the Gröbner basis. In Corollaries 8.25 and
8.26, for example, absolutely any monomial ordering works fine. The actual
calculation of Gröbner bases is often computationally demanding, and thus it
is worthwhile to use such a basis that takes relatively little time to compute.
According to computer scientists,¹⁷ Gröbner bases are the most widely useful
when computed relative to the lexicographic ordering, but they are then also
the most time-consuming to compute. The monomial orderings that make the
computation of Gröbner bases proceed quickly tend to be ones that first bound
the total degree in one or two steps. One of the reasons that this kind of monomial
ordering works so efficiently is that once the total degree is bounded, there are
only finitely many monomials less than any given monomial $M$.

¹⁷ The Web essay “Representation and monomial orders,” http://www.umich.edu/~gpcc/scs/magma/text835.htm,
within the publication of the Statistics and Computation
Service listed in the Selected References contains a discussion of various monomial orders and their
uses and advantages.


9. Uniqueness of Reduced Gröbner Bases


In this section, $K$ continues to denote a field, and we work with a fixed monomial
ordering on $K[X_1,\dots,X_n]$. Ideals in $K[X_1,\dots,X_n]$ will always be specified
by giving finite sets of generators. Our objective in this section is to show how
any Gröbner basis can be “reduced” and that a “reduced” Gröbner basis for an
ideal is unique. A by-product of the uniqueness argument will be a way of testing
two ideals for equality.


Any finite set of generators of $I$ that contains a Gröbner basis is again a Gröbner
basis. Thus a constructed Gröbner basis will often be unnecessarily large. One
simple kind of redundance is addressed by Lemma 8.27 below.


Lemma 8.27. If $\{g_1,\dots,g_s\}$ is a Gröbner basis for a nonzero ideal $I$ in
$K[X_1,\dots,X_n]$ and if $\mathrm{LM}(g_1)$ lies in the ideal $(\mathrm{LT}(g_2),\dots,\mathrm{LT}(g_s))$, then
$\{g_2,\dots,g_s\}$ is a Gröbner basis of $I$.


REMARK. Lemma 8.17 shows how to check whether $\mathrm{LM}(g_1)$ lies in the ideal
$(\mathrm{LT}(g_2),\dots,\mathrm{LT}(g_s))$; all we have to do is see whether some $\mathrm{LM}(g_j)$ for $j > 1$
divides $\mathrm{LM}(g_1)$.


PROOF. By hypothesis, $(\mathrm{LT}(g_2),\dots,\mathrm{LT}(g_s)) = (\mathrm{LT}(g_1),\dots,\mathrm{LT}(g_s)) = \mathrm{LT}(I)$.
Therefore $\{g_2,\dots,g_s\}$ is a Gröbner basis of $I$. (Recall that the definition of
Gröbner basis does not assume that the set generates the ideal; Proposition 8.19
deduces that it generates.)


A Gröbner basis $\{g_1,\dots,g_s\}$ of a nonzero ideal $I$ is said to be minimal if
$\mathrm{LC}(g_j) = 1$ for all $j$ and if no $\mathrm{LM}(g_i)$ is divisible by $\mathrm{LM}(g_j)$ for any $j \neq i$.
Lemma 8.27 shows that in trying to transform a Gröbner basis into a form for
which a uniqueness result will apply, there is no loss of generality in assuming
that the given Gröbner basis is minimal.


EXAMPLE. As in the example following Corollary 8.24, let $I$ be the ideal in
$K[X, Y]$ given by $I = (f_1, f_2)$ with $f_1(X, Y) = X^2 + 2XY^2$ and $f_2(X, Y) =
XY + 2Y^3 - 1$. Then we saw that $\{f_1, f_2, f_3, f_4\}$ is a Gröbner basis of $I$ in
the lexicographic ordering, where $f_3(X, Y) = X$ and $f_4(X, Y) = 2Y^3 - 1$.
The leading monomials are $\mathrm{LM}(f_1) = X^2$, $\mathrm{LM}(f_2) = XY$, $\mathrm{LM}(f_3) = X$, and
$\mathrm{LM}(f_4) = Y^3$. The first two are divisible by the third. Therefore $\{X,\ Y^3 - \tfrac12\}$ is
the corresponding minimal Gröbner basis.

Unfortunately an ideal can have more than one minimal Gröbner basis, as is
shown in Problem 17 at the end of the chapter. A Gröbner basis $\{g_1,\dots,g_s\}$ of
an ideal $I$ is said to be reduced if it is minimal and if for each $i$, no monomial
appearing in $g_i$ with nonzero coefficient is divisible by $\mathrm{LM}(g_j)$ for any $j \neq i$.
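
The passage from a Gröbner basis to a minimal one, as in the example above, uses only Lemma 8.27 and a rescaling of leading coefficients. A Python sketch of that step for the lexicographic ordering follows; the dictionary representation and the name `minimal_basis` are assumptions of this sketch, and when two members share a leading monomial it keeps the earlier one.

```python
from fractions import Fraction

def divides(m, n):
    """True if the monomial with exponents m divides the one with exponents n."""
    return all(a <= b for a, b in zip(m, n))

def minimal_basis(basis):
    """Drop members whose leading monomial is divisible by another member's
    leading monomial (Lemma 8.27), then scale the survivors to LC = 1."""
    kept = []
    for i, g in enumerate(basis):
        lm = max(g)                                   # LM(g) in lex order
        redundant = any(
            divides(max(h), lm) and (max(h) != lm or j < i)
            for j, h in enumerate(basis) if j != i
        )
        if not redundant:
            lc = g[lm]
            kept.append({mono: coeff / lc for mono, coeff in g.items()})
    return kept

F = Fraction
G = [
    {(2, 0): F(1), (1, 2): F(2)},                 # f1 = X^2 + 2XY^2
    {(1, 1): F(1), (0, 3): F(2), (0, 0): F(-1)},  # f2 = XY + 2Y^3 - 1
    {(1, 0): F(1)},                               # f3 = X
    {(0, 3): F(2), (0, 0): F(-1)},                # f4 = 2Y^3 - 1
]
# LM(f1) and LM(f2) are divisible by LM(f3) = X, so only X and Y^3 - 1/2 remain.
assert minimal_basis(G) == [{(1, 0): 1}, {(0, 3): 1, (0, 0): F(-1, 2)}]
```

In this example the minimal basis happens to be reduced as well, since the constant term of $Y^3 - \tfrac12$ is not divisible by $X$.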


Theorem 8.28 (uniqueness of reduced Gröbner basis). If $I$ is a nonzero ideal
in $K[X_1,\dots,X_n]$, then $I$ has a unique reduced Gröbner basis, and this can be
obtained algorithmically starting from any minimal Gröbner basis.


PROOF OF UNIQUENESS. Let $\{g_1,\dots,g_s\}$ be any Gröbner basis. Since $\mathrm{LT}(I) =
(\mathrm{LT}(g_1),\dots,\mathrm{LT}(g_s))$, Lemma 8.17 shows that any $\mathrm{LM}(f)$ for $f \in I$ is divisible by
$\mathrm{LM}(g_j)$ for some $j$. If $\{h_1,\dots,h_t\}$ is a second Gröbner basis, then this argument
shows that each $\mathrm{LM}(h_i)$ is divisible by some $\mathrm{LM}(g_j)$. Turned around, the argument
shows that $\mathrm{LM}(g_j)$ is divisible by some $\mathrm{LM}(h_k)$. Since $\{h_1,\dots,h_t\}$ is assumed
minimal, $\mathrm{LM}(h_i)$ cannot be divisible by $\mathrm{LM}(h_k)$ if $i \neq k$. Thus $\mathrm{LM}(h_i) = \mathrm{LM}(h_k)$,
and these equal $\mathrm{LM}(g_j)$. Then it follows that $s = t$ and that we may enumerate
any two minimal Gröbner bases in such a way that the leading monomial of the
$i$-th member of each basis is the same for each $i$ with $1 \leq i \leq s$.

With this normalization in place, let us show that $g_i = h_i$. To do so, we expand
$g_i - h_i$ as $g_i - h_i = \sum_{j=1}^{s} a_j h_j$ with $\mathrm{LM}(g_i - h_i) = \max_j \mathrm{LM}(a_j h_j)$ in accordance
with (b) of Theorem 8.23. Choose $k$ such that the maximum on the right side is
attained at $k$, i.e., such that

$$\mathrm{LM}(a_k)\,\mathrm{LM}(h_k) = \mathrm{LM}(g_i - h_i). \qquad (*)$$

Arguing by contradiction, suppose that the right side of $(*)$ is nonzero. Then it
must be a monomial occurring in either $g_i$ or $h_i$. Since the two Gröbner bases are
reduced, no monomial occurring in $g_i$ is divisible by $\mathrm{LM}(g_k) = \mathrm{LM}(h_k)$ if $k \neq i$,
and similarly for monomials occurring in $h_i$. We conclude that $k = i$ and that
$\mathrm{LM}(h_i) = \mathrm{LM}(g_i - h_i)$. But this is impossible by Proposition 8.18 if $g_i - h_i \neq 0$,
since $\mathrm{LM}(g_i) = \mathrm{LM}(h_i)$ and $\mathrm{LC}(g_i) = \mathrm{LC}(h_i) = 1$. Therefore the right side of $(*)$
is 0, and $g_i = h_i$.


PROOF OF EXISTENCE. Let $\{g_1, \dots, g_s\}$ be a minimal Gröbner basis of $I$. As
was shown in the proof of uniqueness, the leading monomials $\mathrm{LM}(g_1), \dots, \mathrm{LM}(g_s)$
are independent of the choice of the actual minimal basis. Looking at the definition
of "reduced," we see therefore that the property of being reduced is a property of
each member $g_i$ of the basis separately. That is, it is meaningful to say that $g_i$
is reduced if no monomial appearing in $g_i$ with nonzero coefficient is divisible
by $\mathrm{LM}(g_j)$ for some $j \ne i$. We shall show how to replace $g_i$ by an element $g_i'$
with the same leading monomial in such a way that the new set is still a Gröbner
basis and $g_i'$ is reduced, and then the proof will be complete. There is no loss of
generality in taking $i = 1$.

Applying the generalized division algorithm (Proposition 8.20), we write

$$g_1 = \sum_{j=2}^{s} a_j g_j + r \qquad (**)$$

in such a way that

$$\mathrm{LM}(g_1) \ge \max_{2 \le j \le s} \mathrm{LM}(a_j g_j) \qquad (\dagger)$$

and that no monomial appearing in $r$ with nonzero coefficient is divisible by
$\mathrm{LM}(g_j)$ for any $j \ge 2$. If we define $g_1'$ to be this element $r$, then the element $g_1'$
is reduced in the above sense, and the only question is whether $\{g_1', g_2, \dots, g_s\}$
is a Gröbner basis. Since $\{g_1, \dots, g_s\}$ is minimal, $\mathrm{LM}(g_1)$ is not divisible by any
$\mathrm{LM}(g_j)$ for $j \ge 2$. Consequently $\mathrm{LM}(g_1)$ appears with nonzero coefficient on the
left side of $(**)$, and by $(\dagger)$ it does not appear in any of the terms $a_j g_j$ with nonzero
coefficient on the right side. Consequently it appears in $r = g_1'$, and $\mathrm{LM}(g_1) \le
\mathrm{LM}(g_1')$. On the other hand, $(**)$ and $(\dagger)$ together imply that $\mathrm{LM}(g_1') \le \mathrm{LM}(g_1)$.
Therefore $\mathrm{LM}(g_1) = \mathrm{LM}(g_1')$, and $\mathrm{LT}(I) = (\mathrm{LT}(g_1), \mathrm{LT}(g_2), \dots, \mathrm{LT}(g_s)) =
(\mathrm{LT}(g_1'), \mathrm{LT}(g_2), \dots, \mathrm{LT}(g_s))$. Consequently $\{g_1', g_2, \dots, g_s\}$ is a Gröbner basis
by definition.
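
The reduction step in the existence proof is effectively a division with remainder, and it can be sketched in a few lines of Python. The code below is a minimal illustration under stated assumptions, not the text's algorithm verbatim: polynomials in $X, Y$ are dictionaries from exponent pairs to coefficients, the ordering is lexicographic with $X > Y$, and the function names are ours. The sample basis $\{X + Y^3 - \tfrac12,\ Y^3 - \tfrac12\}$ is our own choice; it is a second minimal Gröbner basis of the ideal $(X, Y^3 - \tfrac12)$ from the earlier example, and reducing its first member recovers $X$.

```python
from fractions import Fraction

# Polynomials in X, Y as dicts {(i, j): coeff}. Lex order with X > Y is
# plain tuple comparison on the exponent pairs.
def lm(p):
    return max(p)

def divides(m, n):
    """True if the monomial with exponents m divides the one with exponents n."""
    return all(a <= b for a, b in zip(m, n))

def remainder(f, divisors):
    """Remainder of f on division by the list `divisors`.

    No monomial of the result is divisible by any LM(g), g in divisors.
    """
    f = dict(f)
    r = {}
    while f:
        m = lm(f)
        c = f[m]
        for g in divisors:
            if divides(lm(g), m):
                # Cancel the leading term of f against a monomial multiple of g.
                shift = tuple(a - b for a, b in zip(m, lm(g)))
                factor = Fraction(c) / g[lm(g)]
                for e, v in g.items():
                    key = tuple(a + b for a, b in zip(e, shift))
                    f[key] = f.get(key, 0) - factor * v
                    if f[key] == 0:
                        del f[key]
                break
        else:
            # The leading term is divisible by no LM(g): it joins the remainder.
            r[m] = c
            del f[m]
    return r

half = Fraction(1, 2)
g1 = {(1, 0): 1, (0, 3): 1, (0, 0): -half}   # X + Y^3 - 1/2
g2 = {(0, 3): 1, (0, 0): -half}              # Y^3 - 1/2

# Replace g1 by its remainder on division by the other basis members.
g1_reduced = remainder(g1, [g2])
```

In this instance `g1_reduced` is the polynomial $X$, so the reduced basis is $\{X, Y^3 - \tfrac12\}$, as uniqueness requires.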


Corollary 8.29 (solution of the ideal-equality problem). Let $I$ and $J$ be two
nonzero ideals in $K[X_1, \dots, X_n]$ specified in terms of finite sets of generators.
Then $I = J$ if and only if the reduced Gröbner bases of $I$ and $J$ relative to a
single monomial ordering are the same.


REMARK. As with the solution of problems listed in Corollaries 8.25 and 8.26,
the desired end is independent of the monomial ordering, and in practice one
might just as well start from a monomial ordering for which the computation of
Gröbner bases is relatively easy.


PROOF. This result is immediate from Corollary 8.24 (constructive existence
of Gröbner bases) and Theorem 8.28.


10. Simultaneous Systems of Polynomial Equations 


In this section we combine our techniques concerning the resultant and Gröbner
bases to attack the original problem discussed in Section 1, that of solving systems
of simultaneous polynomial equations in several variables. Our interest ultimately
will be in the case that the underlying field is algebraically closed.

Corollary 8.26 and the Nullstellensatz already combine to give a criterion for
such a system to have no solutions: We regard the system as the zero locus of
an ideal, and we calculate a Gröbner basis for the ideal. Then the system has no
solutions if and only if the Gröbner basis contains a constant polynomial, i.e., if
and only if the reduced Gröbner basis is $\{1\}$.

Let us now consider the problem of finding the solutions when solutions exist.
We begin with the case of two equations in two unknowns over the field $\mathbb{C}$,
recalling what we know from the theory of the resultant. Consider the system

$$\begin{aligned} X^2Y + Y^2 &= 5, \\ XY &= 2. \end{aligned}$$

Set $f(X, Y) = X^2Y + Y^2 - 5$ and $g(X, Y) = XY - 2$. To find points $(x, y)$ with
$f(x, y) = g(x, y) = 0$, using the style of Sections 1–3, we compute the resultant
of $f$ and $g$ in the $X$ variable, say, and obtain the polynomial $Y^4 - 5Y^2 + 4Y$.
Setting this equal to 0 gives us $y = 0$, $y = 1$, and $y = \tfrac12(-1 \pm \sqrt{17})$. We can
then substitute each such $y$ into $x^2y + y^2 = 5$ and get candidates $(x, y)$. Doing so
for $y = 0$ gives us no candidates, and doing so for each of the other three values
of $y$ gives us two values of $x$, differing only in a sign. So we get six pairs $(x, y)$.
However, only three of these satisfy the second given equation, $xy = 2$, one for
each nonzero value of $y$. Thus the resultant gives us a handle on the problem of
finding solutions, but it has two shortcomings: it produced a value of $y$ yielding
no solution pairs $(x, y)$, and it produced extraneous $x$ values.
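
The resultant quoted above can be reproduced by expanding a $3 \times 3$ determinant whose entries are polynomials in $Y$. The sketch below is only an illustration: the coefficient-list encoding, the helper names, and the particular row layout of the Sylvester matrix are assumptions made here (a different layout convention changes the determinant at most by a sign).

```python
# Polynomials in Y as coefficient lists, lowest degree first.
def padd(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def pneg(p):
    return [-a for a in p]

def pmul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def det3(m):
    """Cofactor expansion along the first row of a 3x3 matrix of polynomials."""
    def minor(a, b, c, d):          # a*d - b*c
        return padd(pmul(a, d), pneg(pmul(b, c)))
    (a, b, c), (d, e, f), (g, h, i) = m
    t1 = pmul(a, minor(e, f, h, i))
    t2 = pneg(pmul(b, minor(d, f, g, i)))
    t3 = pmul(c, minor(d, e, g, h))
    return padd(padd(t1, t2), t3)

def trim(p):
    """Drop trailing zero coefficients."""
    while len(p) > 1 and p[-1] == 0:
        p = p[:-1]
    return p

Y = [0, 1]
# As polynomials in X over K[Y]:
#   f = Y*X^2 + 0*X + (Y^2 - 5),   g = Y*X - 2.
sylvester = [
    [Y,   [0],  [-5, 0, 1]],   # coefficients of f
    [Y,   [-2], [0]],          # coefficients of g
    [[0], Y,    [-2]],         # coefficients of g, shifted one column
]
resultant = trim(det3(sylvester))          # coefficients of Y^4 - 5Y^2 + 4Y

# The resultant factors as Y * (Y^3 - 5Y + 4).
factored = pmul(Y, [4, -5, 0, 1])
```

The list `resultant` is `[0, 4, -5, 0, 1]`, that is, $4Y - 5Y^2 + Y^4$, and it agrees with `factored`; the factor $Y$ accounts for the root $y = 0$ that yields no solution.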

To find points $(x, y)$ with $f(x, y) = g(x, y) = 0$, using the style of
Sections 7–10, we consider $(f, g)$ as an ideal in $\mathbb{C}[X, Y]$, and we are interested
in the locus of common zeros $V_{\mathbb{C}}((f, g))$ of the ideal. We start by finding a
reduced Gröbner basis with respect to a suitable ordering. The usual lexicographic
ordering will do fine here, and the result is $\{X + \tfrac12 Y^2 - \tfrac52,\ Y^3 - 5Y + 4\}$. By
what may seem to be good fortune, the second element depends on $Y$ alone, and
the roots are $y = 1$ and $y = \tfrac12(-1 \pm \sqrt{17})$. If we substitute these values into
the equation $x + \tfrac12 y^2 - \tfrac52 = 0$, we get one value of $x$ for each $y$. We can solve
because the coefficient 1 of $x$ is nonzero for each $y$ in question. No pair $(x, y)$
that we obtain is superfluous because the locus of common zeros of $f$ and $g$ is
identical with the locus of common zeros of the members of the Gröbner basis.
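
The two approaches can be compared numerically. The following sketch uses floating-point arithmetic with a tolerance of $10^{-9}$, which is a choice made here for illustration; the variable names are ours. It back-substitutes through the Gröbner basis to get three solutions, and it also forms the six candidates produced by the resultant method and keeps those satisfying $xy = 2$.

```python
import math

def f(x, y):
    return x * x * y + y * y - 5

def g(x, y):
    return x * y - 2

# Roots of Y^3 - 5Y + 4 = (Y - 1)(Y^2 + Y - 4).
roots_y = [1.0, (-1 + math.sqrt(17)) / 2, (-1 - math.sqrt(17)) / 2]

# Groebner route: X + (1/2)Y^2 - 5/2 = 0 determines x from y.
solutions = [(2.5 - 0.5 * y * y, y) for y in roots_y]

# Resultant route: each nonzero root y gives x = +/- sqrt((5 - y^2)/y)
# from x^2 y + y^2 = 5, so there are six candidates.
candidates = [(sign * math.sqrt((5 - y * y) / y), y)
              for y in roots_y for sign in (1, -1)]

# Only candidates that also satisfy xy = 2 are genuine solutions.
survivors = [(x, y) for x, y in candidates if abs(g(x, y)) < 1e-9]
```

Every pair in `solutions` makes both `f` and `g` vanish up to rounding, while only three of the six `candidates` survive the second equation.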

This approach raises several questions about a possible generalization:


(i) Under what conditions can we expect that a Gröbner basis for an ideal $I$
in $K[X, Y]$ will contain a member that depends just on $Y$?
(ii) If the Gröbner basis contains no element that depends just on $Y$, then
what can we expect?
(iii) If we are able to solve for values of $y$, under what conditions can we use
the remaining member(s) of the Gröbner basis to solve for $x$?

Part of the answer to (i) is contained in the Elimination Theorem proved as
Theorem 8.30 below. This theorem says for the lexicographic ordering that the
members of a Gröbner basis that depend just on $Y$ generate $I \cap K[Y]$; in fact,
they form a Gröbner basis of this ideal of $K[Y]$. For the case that $I = (f, g)$, the
resultant is a member of $I \cap K[Y]$. Thus a nonzero resultant ensures that some
member of the Gröbner basis will depend just on $Y$; on the other hand, $I \cap K[Y]$
has to be a principal ideal in $K[Y]$, and any Gröbner basis of that principal ideal
has to contain the ideal's generator (up to a scalar factor). By contrast, a zero
resultant leads us to question (ii) because it says, by Theorem 8.1, that $f$ and
$g$ have a common factor $h(X, Y)$ of positive degree in $X$ as long as both $f$ and
$g$ have positive degree in $X$. The largest power of $X$ in $h$ has as coefficient
a polynomial in $Y$ that has only finitely many roots, and if $K$ is algebraically
closed, then every $y$ unequal to one of these roots will produce an $x$ such that
$h(x, y) = 0$ and therefore such that $f(x, y) = g(x, y) = 0$. In other words,
except in degenerate cases a zero resultant implies that there cannot be a member
of the Gröbner basis that depends just on $Y$. Finally the answer to (iii) lies deeper
and is contained in the Extension Theorem, which is proved as Theorem 8.31
below.

Let $I$ be a nonzero ideal in $K[X_1, \dots, X_n]$, $K$ being any field for now. If
$0 \le k \le n - 1$, then the $k$-th elimination ideal of $I$ is the ideal
$I \cap K[X_{k+1}, \dots, X_n]$ in $K[X_{k+1}, \dots, X_n]$. A monomial ordering on
$K[X_1, \dots, X_n]$ will be said to be of $k$-elimination type if any monomial
containing any of $X_1, \dots, X_k$ to a positive power is greater than any monomial in
$X_{k+1}, \dots, X_n$ alone. The usual lexicographic ordering is of $k$-elimination type
for every $k$. An example of a monomial ordering of $k$-elimination type that is of
great interest in applications is the one of Bayer–Stillman described in Example 4
of monomial orderings in Section 7.
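
The definition of $k$-elimination type can be tested on small cases. The Python sketch below is an illustration with assumptions of our own: monomials are exponent tuples, an ordering is given by a sort key, and only exponents up to a fixed bound are examined, so the check can refute the property but cannot prove it in general. For contrast it also tries the graded lexicographic ordering (total degree first, ties broken lexicographically), which fails already for $k = 1$ because $X_1 < X_2^2$.

```python
from itertools import product

def lex_key(m):
    """Lexicographic ordering: compare exponent tuples directly."""
    return m

def grlex_key(m):
    """Graded lexicographic ordering: total degree first, then lex."""
    return (sum(m), m)

def is_k_elimination(key, k, n, max_exp=2):
    """Check the k-elimination property on all monomials with exponents <= max_exp.

    A finite check only: True means no counterexample was found in this range.
    """
    monomials = list(product(range(max_exp + 1), repeat=n))
    with_early = [m for m in monomials if any(m[:k])]      # involve X_1..X_k
    late_only = [m for m in monomials if not any(m[:k])]   # only X_{k+1}..X_n
    return all(key(a) > key(b) for a in with_early for b in late_only)

lex_ok = [is_k_elimination(lex_key, k, 3) for k in (0, 1, 2)]
grlex_ok = is_k_elimination(grlex_key, 1, 3)
```

On this range the lexicographic ordering passes for every $k$, and the graded lexicographic ordering does not pass for $k = 1$.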


Theorem 8.30 (Elimination Theorem). Let $K$ be any field, let $I$ be a
nonzero ideal in $K[X_1, \dots, X_n]$, let $0 \le k < n$, and fix a monomial ordering
of $k$-elimination type. If $\{g_1, \dots, g_s\}$ is a Gröbner basis of $I$, then the subset of
members of $\{g_1, \dots, g_s\}$ depending only on $X_{k+1}, \dots, X_n$ is a Gröbner basis of
the $k$-th elimination ideal $J = I \cap K[X_{k+1}, \dots, X_n]$.


PROOF. Relabeling the members of $\{g_1, \dots, g_s\}$, we may assume that the $g_j$'s
lying in $J$ are $g_1, \dots, g_t$. The first step is to show that $J = (g_1, \dots, g_t)$. If
$f \in J$ is given, we apply the generalized division algorithm (Proposition 8.20)
and write $f = \sum_{i=1}^{s} a_i g_i + r$ with $\mathrm{LM}(a_i g_i) \le \mathrm{LM}(f)$ for all $i$ and with no
monomial appearing in $r$ with nonzero coefficient divisible by $\mathrm{LM}(g_j)$ for any
$j$. Corollary 8.21 shows that $r = 0$. If $a_i \ne 0$ and $i$ is not $\le t$, then $\mathrm{LM}(a_i g_i)$
involves at least one of $X_1, \dots, X_k$, and the definition of monomial ordering of
$k$-elimination type implies that $\mathrm{LM}(a_i g_i) > \mathrm{LM}(f)$. It follows that $a_i = 0$ for
$i > t$, and thus $J = (g_1, \dots, g_t)$.

To see that $\{g_1, \dots, g_t\}$ is a Gröbner basis of $J$, we apply Theorem 8.23. We
are to show for each pair $(g_j, g_k)$ with $S(g_j, g_k) \ne 0$ and $\{j, k\} \subseteq \{1, \dots, t\}$ that
there is an expansion $S(g_j, g_k) = \sum_{i=1}^{t} a_i g_i$ with $\mathrm{LM}(a_i g_i) \le \mathrm{LM}(S(g_j, g_k))$. In
view of the argument with $f$ in the previous paragraph, it is enough to show that
$S(g_j, g_k)$ lies in $J$. The formula is

$$S(g_j, g_k) = \frac{\mathrm{LCM}(\mathrm{LM}(g_j), \mathrm{LM}(g_k))}{\mathrm{LT}(g_j)}\, g_j - \frac{\mathrm{LCM}(\mathrm{LM}(g_j), \mathrm{LM}(g_k))}{\mathrm{LT}(g_k)}\, g_k.$$

The coefficient fractions are members of $K[X_{k+1}, \dots, X_n]$, since the monomial
ordering is of $k$-elimination type, and thus $S(g_j, g_k)$ is indeed in $J$.


EXAMPLE. Formula for discriminant of a polynomial in one variable. This
example is one that we have addressed before by specialized methods. We include
it anyway because the use of Gröbner bases allows one to solve many similar
problems that the specialized methods do not address. By way of illustration,
let $(X - r)(X - s)(X - t)$ be a cubic polynomial. The discriminant is $D =
(r - s)^2(s - t)^2(r - t)^2$. This is a polynomial that is symmetric in $r, s, t$, and the
general theory of symmetric polynomials (in the problems for Chapter VIII in
Basic Algebra) shows that it has to be a polynomial in the elementary symmetric
polynomials $a = r + s + t$, $b = rs + rt + st$, $c = rst$. We seek a formula for $D$
in terms of $a, b, c$. We form the ideal $I$ in $K[r, s, t, D, a, b, c]$ given by

$$I = \big(D - (r - s)^2(s - t)^2(r - t)^2,\ a - (r + s + t),\ b - (rs + rt + st),\ c - rst\big).$$

With the variables enumerated as $r, s, t, D, a, b, c$, we use any monomial ordering
of 3-elimination type, the lexicographic ordering for example, so that the three
variables $r, s, t$ are the ones eliminated, and form the
reduced Gröbner basis of $I$. Calculation best done with the aid of a computer
gives $D - a^2b^2 + 4b^3 + 4a^3c - 18abc + 27c^2$ and three other members of $I$ that
involve $r$, $s$, or $t$. Theorem 8.30 shows that the 3rd elimination ideal is principal
with generator $D - a^2b^2 + 4b^3 + 4a^3c - 18abc + 27c^2$. Thus the desired formula
is $D = a^2b^2 - 4b^3 - 4a^3c + 18abc - 27c^2$.
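
The discriminant formula just obtained is a polynomial identity in $r, s, t$, so it can be spot-checked with exact integer arithmetic. The sketch below is such a sanity check on a few sample triples of roots chosen here for illustration; it is not a substitute for the Gröbner basis computation, and the function names are ours.

```python
def discriminant_from_roots(r, s, t):
    """D = (r - s)^2 (s - t)^2 (r - t)^2."""
    return (r - s) ** 2 * (s - t) ** 2 * (r - t) ** 2

def discriminant_from_symmetric(a, b, c):
    """D = a^2 b^2 - 4 b^3 - 4 a^3 c + 18 a b c - 27 c^2."""
    return a * a * b * b - 4 * b ** 3 - 4 * a ** 3 * c + 18 * a * b * c - 27 * c * c

samples = [(1, 2, 3), (0, 1, -1), (2, 2, 5), (-3, 4, 7)]
checks = []
for r, s, t in samples:
    # Elementary symmetric polynomials of the roots.
    a, b, c = r + s + t, r * s + r * t + s * t, r * s * t
    checks.append(discriminant_from_roots(r, s, t)
                  == discriminant_from_symmetric(a, b, c))
```

For the roots $1, 2, 3$ both expressions equal 4, and for the repeated root in $(2, 2, 5)$ both vanish.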


Let us come to the Extension Theorem. The statement and proof of this theorem
do not make use of Gröbner bases, but they do refer to the $k$-th elimination ideal,
which is identified explicitly in Theorem 8.30 with the aid of a Gröbner basis.
The intention is that the theorem be applied inductively in any application, taking
into account one additional variable at each step of an induction.


Theorem 8.31 (Extension Theorem). Let $K$ be an algebraically closed field, let
$I = (f_1, \dots, f_s)$ be an ideal in $K[X_1, \dots, X_n]$, and let $J$ be the first elimination
ideal of $I$ in $K[X_2, \dots, X_n]$. For each $f_i$, expand $f_i$ in powers of $X_1$ as

$$f_i(X_1, \dots, X_n) = g_i(X_2, \dots, X_n)\, X_1^{l_i} + (\text{lower powers of } X_1)$$

with $g_i$ in $K[X_2, \dots, X_n]$ and $g_i$ nonzero unless $f_i = 0$. Suppose that $(c_2, \dots, c_n)$
lies in the zero locus $V_K(J) \subseteq K^{n-1}$. If $g_i(c_2, \dots, c_n) \ne 0$ for some $i$, then there
exists $c_1$ in $K$ such that $(c_1, \dots, c_n)$ is in the zero locus $V_K(I) \subseteq K^n$.

Before giving the proof, we need to extend the theory of the resultant slightly
in such a way that it applies to $s$ polynomials $f_1, \dots, f_s$, rather than just to two.
To do so, we introduce new indeterminates $U_2, \dots, U_s$ and regard

$$F = U_2 f_2 + \cdots + U_s f_s$$

as a member of $K[U_2, \dots, U_s, X_1, \dots, X_n]$ whose degree $\deg_1 F$ in $X_1$ is the
maximum of the degrees of $f_2, \dots, f_s$ in $X_1$. We can then view $f_1$ as a member of
the same polynomial ring $K[U_2, \dots, U_s, X_1, \dots, X_n]$ of degree $\deg_1 f_1$ and form
the resultant of $f_1$ and $F$ in the $X_1$ variable. This is computed as the determinant
of some square matrix of size $\deg_1 f_1 + \deg_1 F$, and we are interested only in
the case that $\deg_1 f_1 \ge 1$ and $\deg_1 F \ge 1$. When expanded in monomials
$U^\alpha = U_2^{\alpha_2} \cdots U_s^{\alpha_s}$, the determinant is of the form

$$R(f_1, F) = \sum_\alpha h_\alpha(X_2, \dots, X_n)\, U^\alpha$$

with each $h_\alpha$ in $K[X_2, \dots, X_n]$. The polynomials $h_\alpha$ will be called the generalized
resultants in the $X_1$ variable of the ordered pair $(f_1, \{f_2, \dots, f_s\})$.


PROOF OF THEOREM 8.31. Let us abbreviate $\bar X = (X_2, \dots, X_n)$ and $\bar c =
(c_2, \dots, c_n)$; we shall write

$$(X_1, \bar X) = (X_1, X_2, \dots, X_n) \quad\text{and}\quad (X_1, \bar c) = (X_1, c_2, \dots, c_n).$$

We seek $c_1 \in K$ with $f_j(c_1, \bar c) = 0$ for all $j$. The assumption is that $g_i(\bar c) \ne 0$
for some $i$, and we may as well assume that this $i$ is $i = 1$. If $\deg_1 f_1 = 0$, then
$f_1$ is in $J$, and the conditions that $f_1 = 0$ on $V_K(J)$ and that $g_1(\bar c) \ne 0$ contradict
one another; hence $\deg_1 f_1 \ge 1$.

As in the paragraph before the proof, put $F = U_2 f_2 + \cdots + U_s f_s$. If $\deg_1 F = 0$,
then $f_j$ is independent of $X_1$ for all $j \ge 2$, and hence $f_j$ is in $J$ for $j \ge 2$. In this
case it is enough to find $c_1$ with $f_1(c_1, \bar c) = 0$. Since $g_1(\bar c) \ne 0$, $f_1(X_1, \bar c)$ is a
one-variable polynomial of degree $l_1 \ge 1$, and it is 0 for some value $c_1$. Thus the
proof is complete if $\deg_1 F = 0$.

We may therefore assume that $\deg_1 F \ge 1$. Form the resultant in $X_1$ given by

$$R(f_1, F) = \sum_\alpha h_\alpha(\bar X)\, U^\alpha,$$

where the $h_\alpha$'s are the generalized resultants mentioned above. The main step is
to prove that each $h_\alpha$ lies in the first elimination ideal $J$. Since $h_\alpha$ depends only
on $\bar X$, it is enough to prove that each $h_\alpha$ is in $I$. We have arranged that each of $f_1$
and $F$ has positive degree and has nonzero leading coefficient in $X_1$, and hence
Theorem 8.1 shows that

$$a f_1 + b F = R(f_1, F)$$

for some nonzero polynomials $a$ and $b$ in $K[U_2, \dots, U_s, X_1, \bar X]$. Let the monomial
expansions of $a$ and $b$ in terms of the $U^\alpha$'s be $a = \sum_\alpha a_\alpha U^\alpha$ and $b =
\sum_\beta b_\beta U^\beta$. Then we have

$$\sum_\alpha a_\alpha f_1 U^\alpha + \Big(\sum_\beta b_\beta U^\beta\Big)\Big(\sum_{i=2}^{s} f_i U_i\Big) = \sum_\alpha h_\alpha U^\alpha. \qquad (*)$$

Let $e_i$ be the multi-index that is 1 in the $i$-th place and 0 elsewhere. This has the
property that $U^{e_i} = U_i$ for $2 \le i \le s$. We can rewrite $(*)$ as

$$\sum_\alpha h_\alpha U^\alpha = \sum_\alpha a_\alpha f_1 U^\alpha + \sum_\alpha \Big(\sum_{\substack{(\beta, i) \text{ with} \\ 2 \le i \le s, \\ \beta + e_i = \alpha}} b_\beta f_i\Big) U^\alpha.$$

Equating the coefficients of $U^\alpha$ on both sides gives

$$h_\alpha = a_\alpha f_1 + \sum_{\substack{(\beta, i) \text{ with} \\ 2 \le i \le s, \\ \beta + e_i = \alpha}} b_\beta f_i$$

and exhibits $h_\alpha$ as in $I$. Therefore $h_\alpha$ is in the elimination ideal $J$.
Since $\bar c$ lies in $V_K(J)$, $h_\alpha(\bar c) = 0$ for all $\alpha$. Consequently

$$R(f_1, F)(U_2, \dots, U_s, \bar c) = 0.$$

Theorem 8.1 shows that $f_1(X_1, \bar c)$ and $F(U_2, \dots, U_s, X_1, \bar c)$ have a common
factor of positive degree in $X_1$ provided either or both of two specific coefficients
are nonzero. These are the coefficients of $X_1^{\deg_1 f_1}$ in $f_1(X_1, \bar c)$ and of $X_1^{\deg_1 F}$ in
$F(U_2, \dots, U_s, X_1, \bar c)$. The coefficient of $X_1^{\deg_1 f_1}$ in $f_1(X_1, \bar X)$ is $g_1(\bar X)$; thus
the coefficient of $X_1^{\deg_1 f_1}$ in $f_1(X_1, \bar c)$ is $g_1(\bar c)$ and is nonzero by assumption.
Therefore Theorem 8.1 is applicable.

The common factor of $f_1(X_1, \bar c)$ and $F(U_2, \dots, U_s, X_1, \bar c)$ may be taken to
be prime, and then it has to be a nonzero scalar multiple of $X_1 - c_1$ for some
$c_1 \in K$, since that is the only kind of prime factor that divides $f_1(X_1, \bar c)$, $K$ being
algebraically closed. Thus the element $c_1$ of $K$ satisfies

$$f_1(c_1, \bar c) = 0 \quad\text{and}\quad F(U_2, \dots, U_s, c_1, \bar c) = 0. \qquad (**)$$

Writing out $F$, we have

$$0 = F(U_2, \dots, U_s, c_1, \bar c) = U_2 f_2(c_1, \bar c) + \cdots + U_s f_s(c_1, \bar c).$$

This is an identity in $K[U_2, \dots, U_s]$, and each coefficient must be 0 on the right
side. Thus $0 = f_2(c_1, \bar c) = \cdots = f_s(c_1, \bar c)$. Since $(**)$ shows that $f_1(c_1, \bar c) = 0$,
this proves the theorem.
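
The hypothesis on the leading coefficients $g_i$ in the Extension Theorem cannot be dropped, and a small computation makes this concrete. The ideal used below, $I = (XY - 1, XZ - 1)$ in $K[X, Y, Z]$, is a standard illustration that does not come from the text; its first elimination ideal is generated by $Y - Z$, so the partial solutions are the points $(a, a)$. The sketch works with rational numbers, which sit inside the algebraically closed field $\mathbb{C}$, and the function names are ours.

```python
from fractions import Fraction

def f1(x, y, z):
    return x * y - 1

def f2(x, y, z):
    return x * z - 1

def extend(a):
    """Try to extend the partial solution (y, z) = (a, a) to a common zero.

    The leading coefficients of f1 and f2 in X are Y and Z. The Extension
    Theorem promises an extension when at least one is nonzero at (a, a).
    """
    if a == 0:
        # Both leading coefficients vanish, and x*0 = 1 has no solution.
        return None
    x = Fraction(1) / a      # here the extension can be written down directly
    return (x, a, a)

results = {a: extend(Fraction(a)) for a in (2, -3, 0)}
```

The partial solutions $(2, 2)$ and $(-3, -3)$ extend, while $(0, 0)$ lies in the zero locus of the elimination ideal but extends to no point of the zero locus of $I$.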


11. Problems 


1. How many points are in $\mathbb{P}^2_K$ if $K$ is a finite field with $q$ elements?


2. Resolve Cramer’s paradox as formulated in Section 1. 


3. (Euler's Theorem) Prove that if $F(X_1, \dots, X_n)$ is any homogeneous polynomial
of degree $d$, then $\sum_{j=1}^{n} X_j \frac{\partial F}{\partial X_j} = dF$.


4. Let $A$ and $B$ be unique factorization domains, and let $\iota : A \to B$ be a one-one
homomorphism of commutative rings with identity. For each $h(X)$ in $A[X]$, let
$h^\iota(X)$ be the member of $B[X]$ obtained by applying the substitution homomorphism
that acts by $\iota$ on the coefficients and fixes $X$. Using resultants, prove that
if $f(X)$ and $g(X)$ are two members of $A[X]$ such that $f^\iota(X)$ and $g^\iota(X)$ have a
common factor in $B[X]$ that is not in $B$, then $f$ and $g$ have a common factor in
$A[X]$ that is not in $A$.

5. Theorem 8.1 assumes that at least one of the coefficients $f_m$ and $g_n$ is nonzero.
Sometimes this theorem is phrased with the stronger hypothesis that $f_m$ and $g_n$
are both nonzero. By comparing the resultants that are involved, show that all
parts of the theorem with at least one of $f_m$ and $g_n$ nonzero are consequences of
the theorem with both $f_m$ and $g_n$ nonzero.

6. Let $K$ be an algebraically closed field, let $f$ and $g$ be members of $K[X_1, \dots, X_n]$
with $f$ irreducible, and suppose that $g(a_1, \dots, a_n) = 0$ whenever $f(a_1, \dots, a_n)
= 0$. Give two proofs, one using the Nullstellensatz and one using resultants,
that $f$ divides $g$.

7. Factor the member $Y^3 - 2XY^2 + 2X^2Y - 4X^3$ of $\mathbb{C}[X, Y]_3$ into first-degree
factors.

8. Find the intersections in $\mathbb{P}^2_{\mathbb{C}}$ of the zero loci of the projective plane curves
$F(X, Y, W) = X(Y^2 - XW)^2 - Y^5$ and $G(X, Y, W) = Y^4 + Y^3W - X^2W^2$.

9. Let $A$ be a unique factorization domain, let $B = A[Y_1, \dots, Y_m, Z_1, \dots, Z_n]$, let
$F$ and $G$ be the polynomials in $B[X]$ given by

$$F(X) = \prod_{i=1}^{m} (X - Y_i) \quad\text{and}\quad G(X) = \prod_{j=1}^{n} (X - Z_j),$$

and let $R(Y_1, \dots, Y_m, Z_1, \dots, Z_n)$ be the resultant $R(F, G)$ with respect to $X$.
(a) Show that $R(Y_1, \dots, Y_m, Z_1, \dots, Z_n)$ equals 0 if $Y_i$ is set equal to $Z_j$.
(b) Deduce from (a) that $Y_i - Z_j$ divides $R(Y_1, \dots, Y_m, Z_1, \dots, Z_n)$.
(c) Deduce from (b) that $R(Y_1, \dots, Y_m, Z_1, \dots, Z_n) = c \prod_{i,j} (Y_i - Z_j)$ for
some $c \ne 0$ in $A$ depending on $m$ and $n$.

10. Let $f(X)$ be in $K[X]$, $K$ being a field, and let $f'(X)$ be the derivative of $f(X)$.
Using the result of the previous problem and the computation at the beginning
of Section V.4, prove that $R(f, f')$ is a nonzero multiple of the discriminant of
$f$, the multiple depending only on $\deg f$.

11. Let $F$ and $G$ be the homogeneous polynomials given by $F(X, Y, W) =
(X^2 + Y^2)^2 + 3X^2YW - Y^3W$ and $G(X, Y, W) = (X^2 + Y^2)^3 - 4X^2Y^2W^2$.
Calculate $I(P, F \cap G)$ for $P = [0, 0, 1]$.

12. Let $G$ be a nonconstant homogeneous polynomial in $K[X, Y, W]_d$ vanishing at
a point $P$ of $\mathbb{P}^2_K$, let $m = m_P(G)$ be the order of vanishing of $G$ at $P$, and let $L$
be a projective line through $P$. Show from the definitions that $L$ is a tangent line
to $G$ at $P$ in the sense of Section 5 if and only if $I(P, L \cap G) \ge m + 1$ in the
sense of Section 4.

13. Deduce relative to an arbitrary monomial ordering the (nonconstructive) existence
of a Gröbner basis for a nonzero ideal $I$ in $K[X_1, \dots, X_n]$ from the form
of a set of generators of the ideal $\mathrm{LT}(I)$.

14. For $1 \le i \le n$, let $w^{(i)}$ be the weight vector $w^{(i)} = (w_1^{(i)}, \dots, w_n^{(i)})$ in $\mathbb{R}^n$, and
suppose that these vectors are linearly independent. Show that the $w^{(i)}$ define a
monomial ordering as in Example 5 of Section 7 if and only if for each $j$, the
first $i$ with $w_j^{(i)} \ne 0$ has $w_j^{(i)} > 0$.

15. This problem shows for two variables that every monomial ordering arises from a
system of two independent weight vectors satisfying the condition in the previous
problem. Let a monomial ordering be imposed on $K[X, Y]$.
(a) If $X > Y^q$ for all $q > 0$, show that the ordering is lexicographic and is
determined by the system of two weight vectors $\{(1, 0), (0, 1)\}$.
(b) If $X < Y^q$ for some $q > 0$, show that there exists a unique real number
$r > 0$ such that for all ordered pairs of integers $u \ge 0$ and $v \ge 0$, $X^u > Y^v$
if $ru > v$ and $X^u < Y^v$ if $ru < v$.
(c) If $X < Y^q$ for some $q > 0$ and if $r$ is defined as in (b), prove that the
monomial ordering is determined by the system of two weight vectors
$\{(r, 1), (s, t)\}$ for a suitable $(s, t)$.

16. In $K[X, Y]$, define $f(X, Y) = X^2Y + XY^2 + Y^2$, $f_1(X, Y) = XY - 1$, and
$f_2(X, Y) = Y^2 - 1$. Show that

$$f = (X + Y) f_1 + 1 \cdot f_2 + r_1 = X f_1 + (X + 1) f_2 + r_2$$

with $r_1(X, Y) = X + Y + 1$ and $r_2 = 2X + 1$ gives two decompositions in the
lexicographic ordering of $f$ relative to $\{f_1, f_2\}$ satisfying the conditions of the
generalized division algorithm of Proposition 8.20. Conclude that the remainder
term need not be unique, nor need the coefficients of $f_1$ and $f_2$.


17. Observe for any scalar $c$ that the ideal $I = (X^2 + cXY, XY)$ in $K[X, Y]$ is
independent of $c$.
(a) Verify that $\{X^2 + cXY, XY\}$ is a minimal Gröbner basis of $I$ relative to the
lexicographic ordering for any choice of $c$.
(b) Show that $\{X^2, XY\}$ is the reduced Gröbner basis for $I$.


Problems 18–20 characterize ideals in $K[X_1, \dots, X_n]$ whose locus of common zeros
is a finite set under the assumption that $K$ is an algebraically closed field. Thus let
$K$ be an algebraically closed field, and let $I$ be a nonzero ideal in $K[X_1, \dots, X_n]$.

18. Under the assumption for each $j$ with $1 \le j \le n$ that $I$ contains a nonconstant
polynomial $P_j(X_j)$, prove that $V_K(I)$ is a finite set.

19. Conversely under the assumption that $V_K(I)$ is a finite set, use the Nullstellensatz
to produce for each $j$, a nonconstant polynomial $P_j(X_j)$ lying in $I$.

20. Impose the usual lexicographic ordering on monomials. Prove that $\mathrm{LT}(I)$ contains
some power $X_j^{m_j}$ for each $j$ with $1 \le j \le n$ if and only if $V_K(I)$ is a finite
set. (Educational note: The advantage of this characterization over the one in
Problems 18–19 is that checking this one is easy by inspection once a Gröbner
basis of $I$ has been computed.)


Problems 21–23 relate solutions of simultaneous systems of polynomial equations to
the theory of the Brauer group in Chapter III. A field $L$ is said to satisfy condition
(C1) if every homogeneous polynomial of degree $d$ in $n$ variables with $d < n$ has a
nontrivial zero. The significance of this condition was shown in Problem 20 at the
end of Chapter III: the Brauer group $B(L)$ of such a field is necessarily 0. The present
set of problems establishes that a simple transcendental extension of an algebraically
closed field satisfies condition (C1). No knowledge of Chapter III is needed for these
problems, but Problem 23 will take for granted a certain theorem to be proved in
Chapter X.

21. Let $K$ be an algebraically closed field, and let $L = K(X)$ be a simple transcendental
extension. It is to be shown that any member $F(T_1, \dots, T_n)$ of $L[T_1, \dots, T_n]_d$
of the form $F(T_1, \dots, T_n) = \sum_{i_1 + \cdots + i_n = d} a_{i_1, \dots, i_n} T_1^{i_1} \cdots T_n^{i_n}$ has a nontrivial zero if
$d < n$ and each $a_{i_1, \dots, i_n}$ lies in the field $L = K(X)$.
(a) Why is it enough to consider such polynomials with each $a_{i_1, \dots, i_n}$ in the
polynomial ring $K[X]$?
(b) With the simplification from (a) in place, let $\delta$ be the maximum degree in
$X$ of the coefficients $a_{i_1, \dots, i_n}$. Let $N$ be a positive integer to be specified. By
looking for a solution of the form $T_i = \sum_{j=0}^{N} b_{ij} X^j$ with each $b_{ij}$ in $K$, show
that substitution of this formula into the formula $F(T_1, \dots, T_n) = 0$ leads
to a system of homogeneous polynomial equations over $K$ in the unknowns
$b_{ij}$, one of each degree from 0 to $\delta + Nd$.

22. (a) In the setting of the previous problem, show that the number of unknowns
is $(N + 1)n$ and that the number of equations is at most $Nd + \delta + 1$.
(b) Show for $N$ sufficiently large that the number of equations is less than the
number of unknowns.

23. The following theorem will be discussed in Chapter X: if $K$ is algebraically
closed and if $m \le n$, then the locus of common zeros in $\mathbb{P}^n_K$ of $m$ nonconstant
homogeneous polynomials in $K[X_1, \dots, X_{n+1}]$ is nonempty. Assuming this
theorem, deduce from the previous two problems the conclusion that the field
$L = K(X)$ satisfies condition (C1) if $K$ is algebraically closed.


CHAPTER IX 


The Number Theory of Algebraic Curves 


Abstract. This chapter investigates algebraic curves from the point of view of their function fields, 
using methods analogous to those used in studying algebraic number fields. 

Section 1 gives an overview, explaining how Riemann’s theory of Riemann surfaces of functions
ties in with the notion of an algebraic curve and explaining how such curves can be investigated 
through the discrete valuations of their function fields. It is shown that what needs to be studied is 
arbitrary function fields in one variable over a base field. It is known that every compact Riemann 
surface can be viewed as an algebraic curve irreducible over C, and thus the function fields of 
compact Riemann surfaces are to be viewed as informative examples of the theory in the chapter. 

Section 2 introduces the notion of a divisor, which is any formal finite Z linear combination of 
the discrete valuations of the function field that are trivial on the base field, and the notion of the 
degree of a divisor, which is the sum of its coefficients weighted suitably. Each nonzero member 
x of the function field gives rise to a principal divisor (x), and the main result of the section is that 
the degree of every principal divisor is 0. This is an analog for function fields of the Artin product 
formula for number fields. 

Section 3 contains the definition of the genus of the function field under study. The main object
of study is the vector space $L(A)$ for a divisor $A$; this consists of 0 and all nonzero members $x$ of
the function field such that $(x) + A$ is a divisor $\ge 0$. Roughly speaking, it may be viewed as the
space of functions on the zero locus of the curve whose poles are limited to finitely many points and
to a certain order depending on the point. The genus is defined in terms of $\dim L(A) - \deg A$ when
$A$ is a divisor that is a large multiple of the pole part of any fixed principal divisor. The main result
of the section is Riemann’s inequality, which says that $\dim L(A) \ge \deg A + 1 - g$ for all divisors
$A$, where $g \ge 0$ is the genus, and that $g$ is the smallest integer that works in this inequality for all
divisors $A$.

Sections 4–5 concern the Riemann–Roch Theorem, which gives an interpretation of the difference
of the two sides of Riemann’s inequality as $\dim L(B)$ for a suitable divisor $B$ that can be defined in
terms of $A$. Section 4 gives the statement and proof of the theorem, and Section 5 gives a number
of simple applications.


1. Historical Origins and Overview 


As was mentioned in Chapter VIII, modern algebraic geometry grew out of early
attempts to solve simultaneous polynomial equations in several variables and out
of the theory of Riemann surfaces. Chapter VIII discussed the impact of the first
of these sources, and the present chapter discusses the impact of the second.

The theory of Riemann surfaces was begun by Riemann and continued by
Liouville, Abel, Jacobi, Weierstrass, and others. This section discusses briefly 
the point of view in these studies, which began as an effort to solve a problem in 
real analysis, moved into complex analysis, and finally arrived at investigations of 
affine plane curves over C, but from a point of view quite different from the one in 
Chapter VIII. The end result is a study of the curve through the functions on its zero 
locus, and the approach has something in common with the approach to algebraic 
number theory in Chapter VI. It is not necessary to understand the background in 
maximum generality, and we shall be content with suitable examples. 

Riemann was interested in saying something useful about seemingly intractable
integrals like the one arising from the arc length of an ellipse; let us take

$$y = \int^{x} \frac{dt}{\sqrt{(t - a)(t - b)(t - c)}},$$

where $a, b, c$ are distinct constants, as a specific example. The lower limit of
integration is unimportant, since it affects the value of the integral only by an
additive constant. We sketch an analysis of the integral,¹ proceeding formally for
the moment. Although $y$ as a function of $x$ seems intractable, any sort of inverse
function has nice properties. The formula for $y$ gives us

$$dy = \frac{dx}{\sqrt{(x - a)(x - b)(x - c)}},$$

and an inverse function $x = x(y)$ thus has derivative

$$\frac{dx}{dy} = \sqrt{(x - a)(x - b)(x - c)}.$$

Consequently we should expect that

$$\Big(\frac{dx}{dy}\Big)^2 = (x - a)(x - b)(x - c).$$

Of course, the singularities at $a, b, c$ are problematic, and the square root might
have a negative argument, depending on the location of $x$.

Riemann’s starting point for a rigorous investigation was to let $x$ be complex,
rather than real, and to let the integral be taken over paths in $\mathbb{C}$. The result is
then not an ordinary function $y(x)$, since the square root in the integrand is not
a well-defined function for $t$ in $\mathbb{C} - \{a, b, c\}$. We can make a choice for which
the square root is well defined, however, as long as we restrict attention to a
small neighborhood of a particular $t$. Thus we can visualize small overlapping
disks each centered at a point along an arbitrary path of integration with $t$ in
$\mathbb{C} - \{a, b, c\}$ with the property that the integrand is well defined on each such
disk. The interpretation of the square root may be assumed to match on the
intersection of any two disks. When a path goes around one or more of the
singularities and we return to the same $t$, we view the new disk as the same as
the old one if the values of the square root match, but as different if the values do
not match. The union of the disks with this convention becomes a new domain
of interest, and the function $F(t) = \sqrt{(t - a)(t - b)(t - c)}$ on $\mathbb{C} - \{a, b, c\}$
becomes a well-defined function $F(\zeta)$ on this new domain. This new domain is a
relatively simple example of a Riemann surface, i.e., a connected 1-dimensional
complex manifold.

¹For more details one can consult the author’s book Elliptic Curves, pp. 165–183.


In more modern language the new domain is a twofold covering of the three-times
punctured plane $\mathbb{C} - \{a, b, c\}$, obtained as follows. We fix a base point $z_0$
in $\mathbb{C} - \{a, b, c\}$ and define a winding number for each of the points $a, b, c$ as
usual. The subset of the fundamental group of $\mathbb{C} - \{a, b, c\}$ for which the sum of
the three winding numbers is even is a subgroup and corresponds, via standard
covering-space theory, to a certain twofold covering space $\mathcal{R}$ of $\mathbb{C} - \{a, b, c\}$,
the covering map being called $e$. This covering space is a new domain on which
the integrand is well defined. On each fiber of the covering, $e$ is two-to-one. Let
$\zeta_0$ be one of the two preimages of $z_0$. Let us adjoin points $a^*, b^*, c^*, \infty^*$ to the
covering space $\mathcal{R}$ and extend $e$ by the definitions $e(a^*) = a$, $e(b^*) = b$, $e(c^*) = c$,
$e(\infty^*) = \infty$. One can show that the complex structure extends from $\mathcal{R}$ to the
enlarged space $\mathcal{R}^*$ in such a way that the extended $e$ is a holomorphic function
from $\mathcal{R}^*$ onto $\mathbb{C} \cup \{\infty\}$. The enlarged space $\mathcal{R}^*$ becomes a compact Riemann
surface, and the extended $e$ is a branched covering of the Riemann sphere $\mathbb{C} \cup \{\infty\}$.
Topologically $\mathcal{R}^*$ turns out to be a torus, as we shall see in a moment.


Riemann in his own investigations went on to study the function theory of 
compact Riemann surfaces. The interest is in deciding whether there is a globally 
defined meromorphic function with poles/zeros only at chosen points and with 
poles/zeros at most/least of some specified order. If there is such a function, 
one wants to know the dimension of the space of such functions. The basic tool 
for addressing this question is the Riemann–Roch Theorem. In the context of
Riemann surfaces, the Riemann–Roch Theorem has both an analysis aspect and an
algebraic aspect. The analysis aspect may be viewed as using the theory of elliptic 
differential operators to prove existence of enough nonconstant meromorphic 
functions for the Riemann surface to acquire an algebraic structure. For the 
purposes of this book, we can just accept this circumstance and not try to extend 
it in any way; however, we will sketch in a moment how the algebraic structure 
can be obtained concretely for our example. The algebraic aspect may be viewed 
as mining this algebraic structure to deduce as many dimensionality relations 
as possible among the function spaces of interest. This is the theory that we 
shall want to extend; we return to our method for carrying out this project after 
producing the algebraic structure for our example by elementary means. 

To introduce the algebraic structure in our example, we use our knowledge of
$\mathcal{R}^*$ to make sense out of the expression


$$w(C) = \int_C F(\zeta)^{-1}\,d\zeta$$


for any piecewise smooth curve C on $\mathcal{R}^*$ that starts from the base point $\zeta_0$.
If C is given by C(t) for t in an interval I, then this integral is to be equal to
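$w(C) = \int_I F(C(t))^{-1}(e \circ C)'(t)\,dt$. Let $\Gamma_a, \Gamma_b, \Gamma_c$ be small loops in $\mathbb{C} - \{a,b,c\}$
respectively about a, b, c based at $z_0$, each having winding number 1, and define
$\Gamma_1 = \Gamma_a\Gamma_b$ and $\Gamma_2 = \Gamma_b\Gamma_c$. Lift $\Gamma_1$ and $\Gamma_2$ to curves $\tilde\Gamma_1$ and $\tilde\Gamma_2$ in $\mathcal{R}^*$ based at
$\zeta_0$, and define


$$\omega_1 = \int_{\tilde\Gamma_1} F(\zeta)^{-1}\,d\zeta \qquad\text{and}\qquad \omega_2 = \int_{\tilde\Gamma_2} F(\zeta)^{-1}\,d\zeta.$$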


It turns out that $\Lambda = \mathbb{Z}\omega_1 + \mathbb{Z}\omega_2$ is a lattice in $\mathbb{C}$ and that there is a well-defined
function $w : \mathcal{R}^* \to \mathbb{C}/\Lambda$ such that whenever $\zeta$ is in $\mathcal{R}^*$ and C is a piecewise
smooth curve from $\zeta_0$ to $\zeta$, then $w(\zeta) = w(C) \bmod \Lambda$. The function $w(\,\cdot\,)$ is one-one
onto and is biholomorphic. In particular, $\mathcal{R}^*$ is exhibited as homeomorphic
to a torus.

Let $w^{-1} : \mathbb{C}/\Lambda \to \mathcal{R}^*$ be the inverse function of w, and let $\mu : \mathbb{C} \to \mathbb{C}/\Lambda$ be
the quotient map. Then the composition $P = e \circ w^{-1} \circ \mu$ carries $\mathbb{C}$ to $\mathbb{C} \cup \{\infty\}$ and
can be seen to satisfy $P'^2 = (P-a)(P-b)(P-c)$. In other words, P has been
constructed rigorously as an inverse function to the original integral. Except for
small details, P is the Weierstrass $\wp$ function for the lattice $\Lambda$ in $\mathbb{C}$. It is almost
true that $z \mapsto (P(z), P'(z))$ is a parametrization of the zero locus of the affine
plane curve $y^2 - (x-a)(x-b)(x-c)$ defined over $\mathbb{C}$. The sense in which this
parametrization fails is that P(z) takes on the value $\infty$ at certain points. What
happens more precisely is that $z \mapsto [P(z), P'(z), 1]$ is a parametrization of the
zero locus of the projective plane curve $Y^2W - (X-aW)(X-bW)(X-cW)$.


Our initial focus in this chapter is in mining this kind of algebraic-curve 
structure over C to deduce as many dimensionality relations as possible among 
interesting finite-dimensional subspaces of scalar-valued functions on the zero 
locus of the curve. For instance in the example above, one can ask for the 
dimension of the space of meromorphic functions on $\mathcal{R}^*$ with at worst simple
poles at two specified points and with no other poles. The main theorem of this
chapter, the Riemann–Roch Theorem, gives quantitative information about the
dimension of this space and of similar spaces. The goal for this introduction is 
to frame this question as an algebra question about the algebraic structure and 
to see that some basic tools introduced in Chapter VI in the context of algebraic 
number theory are the appropriate tools to use here. 

The primary object of study is the “function field” of the curve in question.
Let us construct this function field for our example. The ideal


$$I = (Y^2 - (X-a)(X-b)(X-c))$$


in $\mathbb{C}[X,Y]$ is prime, and the restrictions of all polynomial functions to its zero
locus V(I) may be identified with the integral domain $R = \mathbb{C}[X,Y]/I$ by the
Nullstellensatz. It takes a little argument, which we omit, to justify saying that the
meromorphic functions on the zero locus may be viewed as the field of fractions
F of $\mathbb{C}[X,Y]/I$; suffice it to say for the moment that we insist that the behavior at
all points of the locus, including any points on the line at infinity in the projective 
plane, be limited to poles and zeros, and that is why nonrational functions of 
(X, Y) do not appear. At any rate, F is what is taken as the function field of the 
curve. To have obtained a field by this construction, we could have started with 
any affine plane curve f(X, Y) over C as in Chapter VIII, except that the principal 
ideal (f (X, Y)) in C[X, Y] has to be assumed to be prime to yield an integral 
domain as quotient. That is, f(X, Y) has to be an irreducible polynomial; we say 
that the affine plane curve f(X, Y) has to be assumed to be irreducible over C. 

The study of members of the function field F from the point of view of their 
poles and zeros is analogous to the problem of studying factorizations in the 
number-theoretic setting. This point was already made in Section VIII.7 of Basic 
Algebra, where the case of the affine plane curve above in which (a, b, c) =
(0, +1, −1) was studied in detail. For this one choice of (a, b, c), the integral
domain R = C[X, Y]/I was observed to be the integral closure of C[X] in a 
finite separable extension of C(X), and it is a Dedekind domain by Theorem 8.54 
of Basic Algebra; in fact, the same argument works for any choice of (a, b, c) as 
long as a, b, c are distinct complex numbers. 

Unique factorization of elements into prime elements fails in this R, but we 
saw that a geometrically meaningful factorization instead is the factorization of 
nonzero ideals into prime ideals. This latter factorization is unique because R is a 
Dedekind domain. Meanwhile, since nonzero prime ideals are maximal in R, the 
Nullstellensatz shows² that the nonzero prime ideals in R correspond exactly
to the points of the zero locus V(I). Consequently the unique factorization of
nonzero ideals in R has the geometric interpretation of associating orders of zeros 
and poles to members of R. This all seems very tidy, but there are at least three 
awkward matters that we need to take into account: 


2Let $\varphi : \mathbb{C}[X,Y] \to R$ be the quotient homomorphism. If M is a maximal ideal in R, then
$\varphi^{-1}(M)$ is a maximal ideal in $\mathbb{C}[X,Y]$ and hence is the set of all polynomials vanishing at some
$(x_0, y_0)$. To show that $(x_0, y_0)$ is in V(I), assume the contrary. Then there exists $g \in I$ with
$g(x_0, y_0) \ne 0$. This g is not in the maximal ideal $\varphi^{-1}(M)$, and thus there exist $f \in \varphi^{-1}(M)$ and
$h \in \mathbb{C}[X,Y]$ with $f + gh = 1$. Applying $\varphi$, we obtain $\varphi(f) = 1$, in contradiction to the fact that
$\varphi(f)$ lies in the proper ideal M of R.

(i) we have not included information about zeros and poles at the points at
infinity when the curve is viewed projectively, and that information surely
plays some role,

(ii) the analysis of the function field F seems to rely on a subfield $\mathbb{C}(X)$ for
which there is surely no canonical description,

(iii) the ring R no longer need be integrally closed if a, b, c are not assumed
distinct, if for example (a, b, c) = (0, 0, 1).

Point (ii) turns out to be an advantage, allowing us to work with the given curve 
from multiple perspectives. The “key observation” at the end of this section will 
make clear how we can take advantage of (ii). 

Point (iii) is quite significant. The trouble with the curve $Y^2 - X^2(X-1)$ is that
the curve has a singularity at (0, 0) in the sense of Section VII.5. The maximal
ideals of the ring $\mathbb{C}[X,Y]/(Y^2 - X^2(X-1))$ correspond to points on the zero
locus of the curve; but the ring is not a Dedekind domain, and we have few tools
for working with it. To handle matters properly, we have to form the function
field directly as $F = \mathbb{C}(X)[Y]/(Y^2 - X^2(X-1))$ and define R to be the integral
closure of $\mathbb{C}[X]$ in F. This ring R is bigger than $\mathbb{C}[X,Y]/(Y^2 - X^2(X-1))$ and is
a Dedekind domain. Unfortunately its nonzero prime ideals no longer correspond 
exactly to points of the zero locus. Example 1 below will illustrate. What happens 
is that F readily provides information about the behavior of nonsingular points 
of the zero locus but not about singular points. Problems 5–11 at the end of the
chapter address this matter for nonsingular points for affine plane curves more 
generally. The tool for making the connection for curves in higher dimension is 
Zariski’s Theorem (Theorem 7.23), and we shall carry out the details in Chapter X 
when we treat the geometry of curves, as opposed to the number theory. 

Point (i) is relevant and is easily handled. When we form the function field
of the curve and take R to be the integral closure of $\mathbb{C}[X]$ in it, we can associate
$\mathbb{C}[X]$ with the polynomial functions on $\mathbb{C}$ and think of them as embedded in the
field $\mathbb{C}(X)$ of rational functions. The rational functions are all meaningful on the Riemann
sphere $\mathbb{C} \cup \{\infty\}$, and we study behavior of rational functions near $\infty$ by writing
them in terms of $X^{-1}$ and regarding $X^{-1}$ as a new variable that is near 0. In
studying our curve, the points in the projective plane that we miss by considering
just the affine curve are the ones that lie over $\infty$ in the Riemann sphere. We
study them by considering the integral closure $R'$ of $\mathbb{C}[X^{-1}]$ in F. If the curve is
nonsingular at all points lying over $\infty$, then these points correspond to the prime
ideals of $R'$ whose intersection with $\mathbb{C}[X^{-1}]$ is the prime ideal $X^{-1}\mathbb{C}[X^{-1}]$ of
$\mathbb{C}[X^{-1}]$.


EXAMPLES. 
(1) Affine plane curve $f(X,Y) = Y^2 - X^2(X-1)$. This polynomial is
irreducible over $\mathbb{C}$ but is singular at (0, 0) in the sense that $\partial f/\partial X$ and $\partial f/\partial Y$ both
vanish there. Let $F = \mathbb{C}(X)[Y]/(f(X,Y))$, and let x and y be the images of
X and Y in F. These elements lie in the ring $S = \mathbb{C}[X,Y]/(f(X,Y))$, whose
maximal ideals correspond to points on the zero locus by the Nullstellensatz.
All members of S are of the form a(x) + yb(x), where a and b are arbitrary
polynomials in one variable. Any proper ideal in S containing x has to be of the
form $(x, yc_1(x), \dots, yc_n(x))$ for some polynomials $c_1, \dots, c_n$. A little argument
using the fact that $\mathbb{C}[x]$ is a principal ideal domain shows that the ideal is of the
form (x, yc(x)). Using products of x and polynomials, we see that we can discard
all terms of c(x) but the constant term. Hence the ideal is either (x) itself or is
(x, y). The ideal (x) is not prime, since $y \cdot y$ is in it and y is not in it. The ideal
(x, y) is maximal and hence prime. Since $(x, y)^2 = (x^2, xy, y^2) = (x^2, xy)$ is
properly contained in (x), (x) is not the product of prime ideals in S. Thus S is
not a suitable ring for investigating poles and zeros of members of the field F.
By contrast, a little computation shows that the integral closure R of $\mathbb{C}[x]$ in F
is generated as a $\mathbb{C}$ algebra by x and $x^{-1}y$. This is a Dedekind domain, and the
decomposition of the ideal (x) in R as a product of prime ideals can be checked to
be $(x) = (x, x^{-1}y + i)(x, x^{-1}y - i)$. A factor on the right does not consist of all
functions vanishing at some $(0, y_0)$ lying on the zero locus. The only point $(0, y_0)$
on the zero locus is (0, 0), and the two prime factors of (x) say something about
derivatives at that point. This example will be considered further in Problems
21–22 at the end of the chapter.


(2) Affine plane curve $f(X,Y) = Y^2 - X^4 + 1$. This polynomial is irreducible
over $\mathbb{C}$ and is nonsingular at every point of its zero locus in $\mathbb{C}^2$. Again we form the
function field F, the members x and y of it, and the ring $\mathbb{C}[X,Y]/(f(X,Y))$. Using
the fact that $X^4 - 1$ is square free, we can check that this ring is the full integral
closure R of $\mathbb{C}[x]$ in F. The ring R is a Dedekind domain, and its elements are all
expressions a(x) + yb(x), where a(x) and b(x) are polynomials. Moreover,
we have $(y + x^2)(y - x^2) = y^2 - x^4 = (x^4 - 1) - x^4 = -1$. Consequently
the elements $y \pm x^2$ are nonconstant units in R, and they cannot have zeros or
poles on the zero locus of f(X, Y) in $\mathbb{C}^2$. Thus knowledge of the orders of zeros
and poles at every point of the zero locus of f(X, Y) in $\mathbb{C}^2$ does not determine
a member of R up to a constant factor. Instead, we have to take into account
the behavior at any points at infinity on the zero locus in the projective plane $\mathbb{P}^2_{\mathbb{C}}$.
To see what this set is, we convert f(X, Y) into a homogeneous polynomial of
degree 4, specifically into $F(X,Y,W) = Y^2W^2 - X^4 + W^4$, and then we look
for points [x, y, w] with F(x, y, w) = 0 and w = 0. These have x = 0 and thus
come down to [0, y, 0]. In other words, there is only one point at infinity on the
zero locus of the curve. It is singular because all three partial derivatives of F
are 0 there. The fact that it is singular means that we should not expect the
prime ideals lying over $x^{-1}\mathbb{C}[x^{-1}]$ in the integral closure $R'$ of $\mathbb{C}[x^{-1}]$ in F to
correspond to the points at infinity on the curve. We return to this example shortly.
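
The singularity claims in Examples 1 and 2 can be spot-checked mechanically. The following is a minimal sketch, assuming polynomials are stored as dictionaries from exponent triples to integer coefficients; the helper names `homogenize`, `partial`, `evaluate`, and `gradient_at` are illustrative and do not come from the text. It homogenizes each affine polynomial and evaluates the three partial derivatives at the point in question.

```python
# A polynomial in X, Y, W is a dict {(i, j, k): c} meaning the sum of c * X^i Y^j W^k.

def homogenize(affine, degree):
    """Turn {(i, j): c} in X, Y into a homogeneous polynomial of the given degree."""
    return {(i, j, degree - i - j): c for (i, j), c in affine.items()}

def partial(poly, var):
    """Partial derivative with respect to variable number var (0 = X, 1 = Y, 2 = W)."""
    out = {}
    for exps, c in poly.items():
        e = exps[var]
        if e == 0:
            continue  # the term does not involve this variable
        lowered = list(exps)
        lowered[var] = e - 1
        key = tuple(lowered)
        out[key] = out.get(key, 0) + c * e
    return out

def evaluate(poly, point):
    """Value of the polynomial at point = (x, y, w)."""
    return sum(c * point[0] ** i * point[1] ** j * point[2] ** k
               for (i, j, k), c in poly.items())

def gradient_at(poly, point):
    """The three partial derivatives of poly evaluated at point."""
    return tuple(evaluate(partial(poly, var), point) for var in range(3))

# Example 1: Y^2 - X^2 (X - 1) = Y^2 - X^3 + X^2, homogenized in degree 3.
F1 = homogenize({(0, 2): 1, (3, 0): -1, (2, 0): 1}, 3)
# Example 2: Y^2 - X^4 + 1, homogenized in degree 4 to Y^2 W^2 - X^4 + W^4.
F2 = homogenize({(0, 2): 1, (4, 0): -1, (0, 0): 1}, 4)

origin = (0, 0, 1)        # the affine point (0, 0) of Example 1
at_infinity = (0, 1, 0)   # the only point with w = 0 in Example 2
ordinary = (1, 0, 1)      # the affine point (1, 0), which lies on both curves

for name, poly, point in [("Example 1, origin", F1, origin),
                          ("Example 2, infinity", F2, at_infinity),
                          ("Example 1, (1, 0)", F1, ordinary),
                          ("Example 2, (1, 0)", F2, ordinary)]:
    print(name, evaluate(poly, point), gradient_at(poly, point))
```

In each printed line the first number is the value of the homogeneous polynomial, which is 0 because the point lies on the curve. The gradient is identically zero at the origin for Example 1 and at [0, 1, 0] for Example 2, and it is nonzero at the affine point (1, 0) on both curves.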

All these matters begin to sound quite complicated to sort out, but magically
there is a simple way of handling them: for an affine plane curve irreducible
over $\mathbb{C}$, we work with the field F of rational functions for the curve, ignoring
the geometry of the curve, and we consider all discrete valuations on this field
that are 0 on $\mathbb{C}^\times$. Discrete valuations were discussed at length in Section VI.2.
They depend only on F, not on the choice of a subring for which F is the field 
of fractions. As will be seen in Chapter X, the full set of discrete valuations of 
F gives information about all potential nonsingular points for any affine curve 
with function field F, not necessarily planar; there will even be such a curve 
whose extension to be defined projectively is everywhere nonsingular, and then 
the points on the zero locus of the curve in projective space will be in one-one 
correspondence with the discrete valuations of F. 

Let us review what Chapter VI tells us about discrete valuations in our setting.
Let f(X, Y) be an irreducible polynomial in $\mathbb{C}[X,Y]$, let F be the field
$\mathbb{C}(X)[Y]/(f(X,Y))$, let x and y be the images of X and Y in F, and let R be the
integral closure of $\mathbb{C}[x]$ in F. This is a Dedekind domain by Theorem 8.54 of
Basic Algebra. Corollary 6.10 classifies the discrete valuations of F that are 0 on
$\mathbb{C}^\times$. It shows that all but finitely many correspond to prime ideals in R. There
are only finitely many others. Corollary 6.10 tells us that these other discrete
valuations can be described in terms of the integral closure $R'$ of $\mathbb{C}[x^{-1}]$ in F;
this is another Dedekind domain whose field of fractions is F. The exceptional
discrete valuations of F arise from those prime ideals of $R'$ that occur in the
decomposition of the ideal $x^{-1}R'$ into prime ideals of $R'$. Geometrically we may
view these additional discrete valuations as associated in some way with points at 
infinity in a projective space, but we can proceed with algebraic manipulations of 
these discrete valuations without invoking the geometric interpretation or using 
projective space. 


EXAMPLE 2, CONTINUED. We continue with the affine plane curve $Y^2 - X^4 + 1$,
the prime ideal $I = (Y^2 - X^4 + 1)$, and the ring R given as the integral closure of
$\mathbb{C}[X]$ in the field $F = \mathbb{C}(X)[Y]/I$. Corollary 6.10 divides the discrete valuations
of F that are 0 on $\mathbb{C}^\times$ into two kinds. The ones of the first kind are built from the
nonzero prime ideals of R. Since $y \pm x^2$ are units in R, all of these valuations
take the value 0 on $y \pm x^2$. The discrete valuations of the second kind are those
appearing in the decomposition of the ideal $x^{-1}R'$ in the integral closure $R'$ of
$\mathbb{C}[x^{-1}]$ in F. The element $x^{-2}y$ is in $R'$ because it is a root of the polynomial
$Y^2 - (1 - x^{-4})$ in $\mathbb{C}[x^{-1}][Y]$. Hence $R'$ contains $x^{-1}$ and $x^{-2}y$. On the other
hand, the most general element of F is of the form $a(x^{-1})x^{-2}y + b(x^{-1})$, where a
and b are rational expressions in one variable, and this is a root of the polynomial


$$Y^2 - 2b(x^{-1})Y + \bigl(b(x^{-1})^2 - a(x^{-1})^2(1 - x^{-4})\bigr).$$

For this element to be in $R'$, the coefficients must be in $\mathbb{C}[x^{-1}]$. This means that
b(X) is a polynomial and that $a(X)^2(1 - X^4)$ is a polynomial. Since $1 - X^4$ has
no repeated roots, the latter condition forces a(X) to be a polynomial. Thus $x^{-1}$
and $x^{-2}y$ generate $R'$ as a $\mathbb{C}$ algebra. Define ideals in $R'$ by


$$P_1 = (x^{-1},\, x^{-2}y + 1) \qquad\text{and}\qquad P_2 = (x^{-1},\, x^{-2}y - 1).$$

Then it is straightforward to check the decompositions

$$(x^{-1}) = P_1P_2, \qquad (x^{-2}y + 1) = P_1^4, \qquad\text{and}\qquad (x^{-2}y - 1) = P_2^4.$$


Since $[F : \mathbb{C}(x^{-1})] = 2$ and since $x^{-1}$ is prime in $\mathbb{C}[x^{-1}]$, the ideal $(x^{-1})$ in $R'$ is
the product of at most two prime ideals, and it follows that $P_1$ and $P_2$ are prime
ideals in $R'$. They are distinct because the difference of the respective second
generators is a nonzero scalar. In view of Corollary 6.10, there are exactly two discrete
valuations of F that are 0 on $\mathbb{C}^\times$ other than the ones coming from prime ideals of
R, and these are the ones coming from the prime ideals $P_1$ and $P_2$ of $R'$. Let us call
them $v_1$ and $v_2$. The above decompositions of principal ideals give
$v_1(y + x^2) = v_1(x^2) + v_1(x^{-2}y + 1) = (-2) + (+4) = +2$, whereas
$v_1(y - x^2) = (-2) + (0) = -2$. Thus $v_1$ takes the distinct values 0, +2, and −2 on 1, $y + x^2$,
and $y - x^2$. Similarly $v_2$ takes the values 0, −2, and +2 on these elements.
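
The values ±2 can be cross-checked with local power series. The following is a minimal sketch under an assumption that the text does not prove here: the valuation attached to $P_1$ (respectively $P_2$) is taken to be the order of vanishing in $s = x^{-1}$ along the branch of $t = x^{-2}y$ with $t(0) = -1$ (respectively $t(0) = +1$). On such a branch $t = \pm\sqrt{1 - s^4}$, and $y \pm x^2 = s^{-2}(t \pm 1)$. The function names are illustrative.

```python
from fractions import Fraction

def sqrt_one_minus_u(terms):
    """Return c_0, ..., c_{terms-1} with sqrt(1 - u) = sum of c_n * u**n."""
    coeffs = [Fraction(1)]
    for n in range(1, terms):
        # Ratio of consecutive coefficients in the binomial series for (1 - u)^(1/2).
        coeffs.append(coeffs[-1] * (n - Fraction(3, 2)) / n)
    return coeffs

def t_series(branch, length=12):
    """Coefficients in s of t = branch * sqrt(1 - s^4), truncated to the given length."""
    series = [Fraction(0)] * length
    for n, c in enumerate(sqrt_one_minus_u((length + 3) // 4)):
        series[4 * n] = branch * c
    return series

def order_at_zero(series):
    """Index of the first nonzero coefficient of a truncated power series."""
    for n, c in enumerate(series):
        if c != 0:
            return n
    raise ValueError("no nonzero coefficient at this precision")

def orders(branch):
    """Orders in s of y + x^2 and y - x^2, using y ± x^2 = s^(-2) * (t ± 1)."""
    t = t_series(branch)
    t_plus_1 = [t[0] + 1] + t[1:]
    t_minus_1 = [t[0] - 1] + t[1:]
    return (-2 + order_at_zero(t_plus_1), -2 + order_at_zero(t_minus_1))

print(orders(-1))  # branch through s = 0, t = -1, where the generators of P_1 vanish
print(orders(+1))  # branch through s = 0, t = +1, where the generators of P_2 vanish
```

The first pair is (2, −2) and the second is (−2, 2), matching the values found for $v_1$ and $v_2$ from the ideal decompositions.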


We shall work with those discrete valuations of the field of rational functions 
for the curve under study that are 0 on the base field. These are canonical, 
independent of our choice of some Dedekind domain whose field of fractions is 
the given field. However, making a choice of Dedekind domain is convenient 
for making calculations. Then we can consider the discrete valuations as of two 
kinds, and which discrete valuations are of which kind will depend on our choice 
of Dedekind domain. 


Context for the study in this chapter. Having concluded that the object to 
investigate is the field of rational functions of our curve and that the tools include 
the discrete valuations, we can now consider the context in which we should 
work. Let k be any field, not necessarily algebraically closed. We want to work 
with the “function field” of a suitable kind of curve defined over k. If I is an ideal
in $k[X_1, \dots, X_n]$, then the ring $R = k[X_1, \dots, X_n]/I$ is an integral domain if
and only if the ideal I is prime, and in this case the field of fractions F of R can
be taken to be the associated function field. Thus we restrict attention to the case
that I is prime. To bring in the notion that the curve is to be 1-dimensional, we
recall from Theorem 7.22 that the integral domain R has Krull dimension 1 in
the sense of Section VII.4 if and only if the field of fractions F has transcendence
degree 1 over k. In this case, F is finitely generated as a field over k, with a finite
set of generators consisting of the elements $x_j = X_j + I$ for $1 \le j \le n$. That is,
F is a function field in one variable over k.

Conversely if F is a function field in one variable over k, then F is a finite
algebraic extension of a simple transcendental extension $k(x_1)$. Let us write it as
$F = k(x_1)[x_2, \dots, x_n]$ for some n. Form the polynomial ring $k[X_1, \dots, X_n]$ and
the ring homomorphism of this ring into F that fixes k and sends $X_j$ into $x_j$. The
image of this homomorphism is an integral domain R whose field of fractions is
F, and the kernel is a prime ideal I such that $R \cong k[X_1, \dots, X_n]/I$. Theorem
7.22 tells us that R has Krull dimension 1.

We are led to the following definition. For any field k and any integer $n \ge 1$,
an ideal I in $k[X_1, \dots, X_n]$ is called an affine curve irreducible³ over k if I is
prime and the integral domain $R = k[X_1, \dots, X_n]/I$ has Krull dimension 1. An
affine plane curve (f(X, Y)) in the sense of Chapter VIII will be an object of this
kind if f(X, Y) is an irreducible polynomial.⁴

The geometry of the zero loci of the curves we study will not play a role in the 
mathematics of this chapter; only the field of fractions F and the base field k will. 
We postpone to Chapter X any discussion of the geometry.⁵ For any function
field F in one variable over an arbitrary field k, we shall study in detail those 
discrete valuations of F that are 0 on k. We refer to such discrete valuations as 
the discrete valuations of F defined over k. It will be helpful as motivation to 
remember for the special case in which k is algebraically closed 


• that the members of F may be viewed as all rational functions on the zero
locus of an affine curve irreducible over k,

• that the order-of-a-zero function at any nonsingular point of this zero
locus gives an example of a discrete valuation of F defined over k, and

• that all discrete valuations of F defined over k arise in this way if the
zero locus is nonsingular at every point and we take into account points
at infinity in projective space.


However, the formal development will not make use of these interpretations. 


3Beware of assuming too much irreducibility about such a curve. Just because I is prime does
not mean that I remains prime when we extend the scalars and work with an algebraic closure $k_{\mathrm{alg}}$
of k. For example, $X^2 + Y^2$ is an affine curve irreducible over $\mathbb{R}$, but it factors as $(X + iY)(X - iY)$
over $\mathbb{C}$ and is therefore not irreducible over $\mathbb{C}$.

4This change of context for the word “curve” from the definition in Chapter VIII is appropriate 
because of a change of emphasis: we shall now be studying an associated function field rather than the 
defining ideal. The word “curve” will undergo a genuine change in meaning in Chapter X: because 
of the Nullstellensatz, classical algebraic geometry in the form to be discussed in much of Chapter X 
places emphasis on zero loci defined by prime ideals of polynomials over an algebraically closed 
field, and it will be convenient to define the curve to be the zero locus rather than the defining ideal. 

5In Chapter X we shall introduce two distinct notions of sameness for the zero loci under the
assumption that the field is algebraically closed, namely “isomorphism” and “birational equivalence.” 
The first is a refinement of the second. Birational equivalence will turn out to mean that the function 
fields are isomorphic. An important theorem says that each birational equivalence class of irreducible 
curves contains one and only one isomorphism class of curves that are everywhere nonsingular in 
the sense of Section VII.5. 

What to expect from the study. When k is not necessarily algebraically closed,
these interpretations break down, at least to some extent. Yet the main theorem of
the chapter, the Riemann–Roch Theorem, is still geared to the geometric interpretation
of discrete valuations in terms of poles and zeros. One may reasonably ask
why one goes to the trouble of working in such a general context that the theory no
longer has its geometric interpretation. The answer is that the investigation is to
be regarded as one in number theory, not in geometry. For example, studying an
affine plane curve over a field $\mathbb{F}_p$ is the same as studying solutions of congruences
in two variables modulo a prime. Studying such a curve over the p-adic field $\mathbb{Q}_p$
is the same as studying solutions of such congruences modulo arbitrary powers
of p. The Riemann–Roch Theorem is actually the first serious aid in making this
study. The present chapter therefore does not constitute such a study; it merely 
prepares one for such a study. In addition, there is a side benefit to understanding 
the number theory that arises this way: the methods and results of this subject 
and of algebraic number theory have enough in common that the methods and 
results for each suggest methods and results for the other. 
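
To make the phrase "solutions of congruences in two variables modulo a prime" concrete, the following minimal sketch counts by brute force the pairs (x, y) modulo p that satisfy $y^2 \equiv x(x-1)(x+1)$, which is the curve of Section 1 with (a, b, c) = (0, +1, −1). The function names and the choice of primes are illustrative only, and the quadratic-time search is meant for small p.

```python
def count_affine_points(p, f):
    """Number of pairs (x, y) modulo p with f(x, y) congruent to 0 modulo p."""
    return sum(1 for x in range(p) for y in range(p) if f(x, y) % p == 0)

def curve(x, y):
    # Y^2 - (X - a)(X - b)(X - c) with (a, b, c) = (0, 1, -1)
    return y * y - x * (x - 1) * (x + 1)

for p in (3, 5, 7, 11, 13):
    print(p, count_affine_points(p, curve))
```

For p = 5, for instance, the values of $x^3 - x$ at x = 0, 1, 2, 3, 4 are congruent to 0, 0, 1, 4, 0, and these contribute 1, 1, 2, 2, 1 values of y respectively, for a total of 7 solutions.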

An especially tantalizing example of this phenomenon concerns zeta functions.
The zeros with $0 < \operatorname{Re} s < 1$ for the Riemann zeta function, which is the
meromorphic continuation to $\mathbb{C}$ of $\zeta(s) = \sum_{n=1}^{\infty} n^{-s} = \prod_{p\ \mathrm{prime}} (1 - p^{-s})^{-1}$,
influence the error term in the distribution of the primes as asserted by the Prime
Number Theorem. The classical Riemann hypothesis is the statement that the
only such zeros occur on the line $\operatorname{Re} s = \tfrac{1}{2}$; it implies a high level of control of
this error term. There is a corresponding zeta function for any algebraic number 
field, and to it corresponds a version of the Riemann hypothesis appropriate for 
prime ideals for the number field. Proofs or counterexamples for these versions 
of the Riemann hypothesis have been sought for more than a century. 

Meanwhile, one can formulate a Riemann hypothesis for any function field 
in one variable over any finite field, and again the statement has consequences 
for the distribution of prime ideals. This time, however, the Riemann hypothesis 
is a theorem, stated and proved by A. Weil in 1940. One might hope that the 
methods used for Weil’s theorem could shed enough light on the classical Riemann 
hypothesis to lead to a proof, but to date this has not happened. 


Key observation to be used during the study. In the next section we shall 
make systematic use of the following construction for any function field F in 
one variable over the field k. If x is any element of F transcendental over k, 
then the only discrete valuations of F defined over k that take a nonzero value 
on x may be described as follows. Let R be the integral closure of k[x] in F, 
and let $R'$ be the integral closure of $k[x^{-1}]$ in F. Then R and $R'$ are Dedekind
domains by Corollary 7.14, whether or not F is a separable extension of k(x).
Both have F as field of fractions. Let the ideals xR of R and $x^{-1}R'$ of $R'$ have
prime decompositions $xR = P_1^{e_1} \cdots P_g^{e_g}$ and $x^{-1}R' = Q_1^{e'_1} \cdots Q_{g'}^{e'_{g'}}$. Then the
valuations $v_{P_i}$ for $1 \le i \le g$ and $v_{Q_j}$ for $1 \le j \le g'$ defined by $P_i$ and $Q_j$ have
$v_{P_i}(x) = e_i$ and $v_{Q_j}(x) = -e'_j$, and no other discrete valuation of F that is defined
over k takes a nonzero value on x. This observation follows from Corollary 6.10 
and the definition of the discrete valuation associated with a nonzero prime ideal 
in a Dedekind domain. 


2. Divisors 


Let k be a field, and let F be a function field in one variable over k. The first step 
is one of normalization: there is no loss of generality in replacing k by the larger 
field $k'$ of all elements of F that are algebraic over k.⁶


Proposition 9.1. Let F be a function field in one variable over k, and let k’ be 
the subfield of all elements in F algebraic over k. If x is in $F^\times$, then every discrete
valuation of F defined over k vanishes on x if and only if x is in $k'$. Consequently
F is automatically a function field in one variable over k’, and as such, its discrete 
valuations defined over k’ coincide with its discrete valuations defined over k. 


PROOF. If $x \in F$ is transcendental over k, then the observation at the end
of Section 1 produces discrete valuations of F defined over k that take nonzero
values on x. Conversely if $x \in F^\times$ is algebraic over k, we argue by contradiction.
We may assume that $x \ne 0$. Suppose that v is a discrete valuation of F defined
over k such that $v(x) \ne 0$. Possibly replacing x by $x^{-1}$, we may assume that
$v(x) > 0$. Being nonzero algebraic over k, x satisfies a polynomial equation


$$a_m x^m + a_{m-1} x^{m-1} + \cdots + a_1 x + a_0 = 0$$

with all $a_j \in k$ and with $a_0 \ne 0$. For each $j \ge 1$ with $a_j \ne 0$, we have
$v(a_j x^j) = v(a_j) + j\,v(x) = j\,v(x) > 0$. If $a_j = 0$, then $v(a_j x^j) = \infty > 0$. Thus
$v(a_m x^m + a_{m-1} x^{m-1} + \cdots + a_1 x) > 0$. Since $v(a_0) = 0$, property (vi) of discrete
valuations in Section VI.2 shows that

$$v\bigl((a_m x^m + a_{m-1} x^{m-1} + \cdots + a_1 x) + a_0\bigr) = v(a_0) = 0 \ne \infty = v(0),$$

contradiction.

The conclusions in the last sentence of the proposition now follow: Since F 
is generated as a field over k by finitely many elements $x_1, \dots, x_n$, it is generated over
k’ by the same elements. Moreover, any element of F transcendental over k is 
transcendental over k’, since k’ is algebraic over k. Thus F is a function field in 
one variable over k’. The first paragraph of the proof shows that every discrete 
valuation of F defined over k is defined over k’, and the converse statement is 
immediate from the definition. 


6The field $k'$ is called the field of constants by some authors.

In accordance with Proposition 9.1, there is no loss of generality in replacing
k by k’ throughout. Changing notation, we assume henceforth that F is a function 
field in one variable defined over k and that every element of F not in k is 
transcendental over k. These hypotheses will not be repeated for each result. 

Suppressing k in the notation, we denote by $V_F$ the set of all discrete valuations
of F defined over k. A divisor is any member of the free abelian group $D_F$ on
$V_F$. Elements of $D_F$ will be written additively,⁷ and thus a typical member of $D_F$ is


$$A = \sum_{v \in V_F} n_v\,v$$


with only finitely many of the integers $n_v$ nonzero. We write $\operatorname{ord}_v A$ for the integer
$n_v$, calling it the order of A at v. The identity element of $D_F$ is called zero and
is denoted by 0.

Each x in $F^\times$ defines a principal divisor (x) by the formula


$$(x) = \sum_{v \in V_F} v(x)\,v.$$


We verify that (x) is indeed a divisor by showing that v(x) is nonzero for only
finitely many v in $V_F$. For x in $k^\times$, v(x) = 0 for all v. All other x are transcendental
over k, and the observation at the end of Section 1 shows that exactly $g + g'$
members of $V_F$ are nonzero on x, where g and $g'$ are certain positive integers
depending on x.

It is sometimes convenient to decompose (x) as a particular difference of two
divisors, writing $(x) = (x)_0 - (x)_\infty$ with


$$(x)_0 = \sum_{v(x) > 0} v(x)\,v \qquad\text{and}\qquad (x)_\infty = \sum_{v(x) < 0} (-v(x))\,v.$$


This notation is motivated by the interpretation of (x) for the case k = C, which 
is discussed in an example below. 

Because of the formula v(xy) = v(x) + v(y), the set of principal divisors is a
subgroup $P_F$ of $D_F$, and the mapping $x \mapsto (x)$ is a group homomorphism of $F^\times$
onto $P_F$. The quotient $C_F = D_F/P_F$ is called the group of divisor classes of F
over k.


EXAMPLE. k = C. This is the setting of a compact Riemann surface, provided 
we take for granted that every compact Riemann surface can be realized as a 
nonsingular projective curve over C. The field F is the field of global meromorphic 


7Some authors use a multiplicative notation. 
functions on the surface. A principal divisor can be viewed as a compilation of
the orders of the zeros and poles of a nonzero global meromorphic function: each
member of $V_F$ corresponds to a point of the surface, and the order of a principal
divisor (x) with $x \in F^\times$ at a point is positive if the meromorphic function x has
a zero at the point, negative if x has a pole there. It is known that the sum of the
orders of all the zeros of a nonzero global meromorphic function equals the sum
of the orders of all the poles. In the current framework the statement is that the
sum of v(x) over all v in $V_F$ is 0 for every $x \in F^\times$ when $k = \mathbb{C}$.


Theorem 9.3 will generalize the fact about compact Riemann surfaces that
$\sum_{v \in V_F} v(x) = 0$ for every $x \in F^\times$ when $k = \mathbb{C}$. When $\mathbb{C}$ is replaced by a more
general field that is not necessarily algebraically closed, Proposition 6.9 already 
shows that the terms v(x) in the corresponding sum have to be weighted by certain 
integers in order to yield sum 0. These integers are dimensions that are shown to 
be finite in the next proposition. 


Proposition 9.2. Let v be any discrete valuation of F defined over k, let $R_v$ be
the valuation ring, and let $P_v$ be the valuation ideal. Then $R_v$ and $P_v$ are k vector
spaces, and $\dim_k R_v/P_v$ is finite.


REMARKS. The integer $f_v = \dim_k R_v/P_v$ is called the residue class degree of
the valuation v. The proof gives a method for computing $f_v$, and we shall make
use of this method shortly in proving Theorem 9.3.


PROOF. The fact that $R_v$ and $P_v$ are k vector spaces is immediate from
Proposition 9.1. Since v is not identically zero, there exists some $x \in F$ with
$v(x) \neq 0$, and x is transcendental by Proposition 9.1. Possibly replacing x by
$x^{-1}$, we may assume that $v(x) > 0$. The observation at the end of Section 1
classifies those members of $\mathbb{V}_F$ taking positive values on x. In that notation we
decompose $(x)R$ as $P_1^{e_1} \cdots P_g^{e_g}$, and v is the valuation defined by $P_j$ for some j.
Theorem 6.5e shows that $R_v/P_v \cong R/P_j$. Since x is prime in k[x], the general
theory of extensions of Dedekind domains shows that $P_j \cap k[x] = xk[x]$ and that
$f_j = \dim_{k[x]/(x)}(R/P_j)$ is finite. The field k[x]/(x) is isomorphic to k, and thus
the dimension over k of $R_v/P_v \cong R/P_j$ is $f_j$.


The degree of a divisor A is the integer $\deg A = \sum_{v \in \mathbb{V}_F} f_v \operatorname{ord}_v(A)$, where
$f_v$ is the residue class degree of v as defined in the remarks with Proposition
9.2. Degree is a homomorphism of $D_F$ into Z. We shall prove in Theorem
9.3 that principal divisors have degree 0. This result extends Proposition 6.9,
which handles the special case of the function field k(x). Theorem 9.3 may be
regarded as a function-field analog of the Artin product formula (Theorem 6.51)
for number fields, but the proof is much easier for function fields because we can
take advantage of the observation at the end of Section 1.


Theorem 9.3. The degree of every principal divisor is 0. In more detail, if $(x)$
is a principal divisor with x not in k, then $\deg (x)_0 = \deg (x)_\infty = \dim_{k(x)} F$, and
hence $\deg (x) = \deg (x)_0 - \deg (x)_\infty = 0$.


PROOF. If x is in $k^\times$, then Proposition 9.1 shows that $v(x) = 0$ for every
$v \in \mathbb{V}_F$, and hence $\deg (x) = 0$. Thus we may assume that x is transcendental
over k. Applying the observation at the end of Section 1 and using the notation
from there, we know that the only v’s for which $v(x) \neq 0$ are the ones relative to
the prime ideals $P_i$ of R and the prime ideals $Q_j$ of R′ such that

$$xR = P_1^{e_1} \cdots P_g^{e_g} \quad\text{and}\quad x^{-1}R' = Q_1^{e'_1} \cdots Q_{g'}^{e'_{g'}}. \qquad (*)$$

Moreover, $v_{P_i}(x) = e_i$ and $v_{Q_j}(x) = -e'_j$. In addition, the proof of Proposition
9.2 showed that the respective residue class degrees are the usual indices $f_i$ and
$f'_j$ associated to the decompositions $(*)$. Thus

$$\deg (x)_0 = \sum_{i=1}^{g} f_i e_i \quad\text{and}\quad \deg (x)_\infty = \sum_{j=1}^{g'} f'_j e'_j.$$

Two applications of Theorem 9.60 of Basic Algebra show that

$$\sum_{i=1}^{g} f_i e_i = \dim_{k(x)} F \quad\text{and}\quad \sum_{j=1}^{g'} f'_j e'_j = \dim_{k(x^{-1})} F.$$

Thus $\deg (x)_0 = \dim_{k(x)} F$, and $\deg (x)_\infty = \dim_{k(x^{-1})} F$. The theorem therefore
follows from the fact that $k(x) = k(x^{-1})$.
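
As a hedged illustration of Theorem 9.3 (not taken from the text), consider the rational function field $F = k(u)$ with $u$ transcendental, and the element $x = u^2/(u-1)$. The names $v_u$, $v_{u-1}$, $v_\infty$ are illustrative labels for the valuations attached to the primes $u$ and $u-1$ of $k[u]$ and to the degree valuation; each has residue class degree 1. The sketch also uses the standard fact that $\dim_{k(x)} k(u)$ is the larger of the degrees of the numerator and denominator of $x$ in lowest terms.

```latex
% Worked check of Theorem 9.3 in F = k(u) with x = u^2/(u-1).
% All valuations other than v_u, v_{u-1}, v_\infty vanish on x.
\begin{align*}
v_u(x) &= 2, \qquad v_{u-1}(x) = -1, \qquad
v_\infty(x) = \deg(u-1) - \deg(u^2) = -1, \\
(x)_0 &= 2\,v_u, \qquad (x)_\infty = v_{u-1} + v_\infty, \\
\deg (x)_0 &= 2 = \deg (x)_\infty, \qquad
\dim_{k(x)} k(u) = \max\bigl(\deg u^2,\ \deg(u-1)\bigr) = 2, \\
\deg (x) &= \deg (x)_0 - \deg (x)_\infty = 0 .
\end{align*}
```

Both halves of the statement are visible here: the zero part and the pole part have the same degree, and that common degree is the field degree of $F$ over $k(x)$.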


Let $D_{F,0}$ be the subgroup of all divisors of degree 0. Theorem 9.3 shows
that $P_F \subseteq D_{F,0}$. The quotient $C_{F,0} = D_{F,0}/P_F$ is therefore a subgroup of
$C_F = D_F/P_F$ and is the group of all divisor classes of degree 0. This is a
function-field analog of the class group for an algebraic number field; it can be
shown to be finite if k is a finite field, but it need not be finite if k is an arbitrary field.


3. Genus 


In this section, F denotes a function field in one variable over a field k, and we 
assume that every element of F outside k is transcendental over k. We continue 
with the notation $\mathbb{V}_F$, $D_F$, $f_v$, $\operatorname{ord}_v A$, $\deg A$, and $(x)$ for $x \in F^\times$, all as in Section 2.

If we were studying only what happens with k = C, we would be interested 
in the vector space of all meromorphic functions whose poles are limited to a 
certain finite set of points and are limited to some particular order at each of those
points. The underlying compact Riemann surface is an ordinary closed orientable
2-dimensional manifold, and the dimensions of these spaces of meromorphic
functions turn out to control the genus of this manifold. For general k, we study
the natural generalization of this situation.$^{8}$ The vector spaces of interest are
defined in terms of divisors, and we will be led to a natural definition of genus of 
the curve under study. 

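We introduce a partial ordering on $D_F$ by saying that two divisors A and B
have $A \le B$ if $\operatorname{ord}_v A \le \operatorname{ord}_v B$ for all $v \in \mathbb{V}_F$. The inequality $B \ge A$ is to
mean the same thing as $A \le B$. If $A \le B$ and $A' \le B'$, then $A + A' \le B + B'$
because $\operatorname{ord}_v(A + A') = \operatorname{ord}_v A + \operatorname{ord}_v A' \le \operatorname{ord}_v B + \operatorname{ord}_v B' = \operatorname{ord}_v(B + B')$.
If $A \le B$, then $-A \ge -B$.

For each divisor A, we shall study the k vector space

$$L(A) = \{0\} \cup \{x \in F^\times \mid (x) \ge -A\} = \{x \in F \mid v(x) \ge -\operatorname{ord}_v A \text{ for all } v \in \mathbb{V}_F\}.$$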


For $x \neq 0$, we can think of $v(x)$ as telling the order of the zero of x at a point
corresponding to v. In that spirit, if $A \ge 0$, then L(A) consists of all functions
whose poles are limited to the set of v’s for which $\operatorname{ord}_v A \neq 0$, with the order of the
pole bounded above by the number $\operatorname{ord}_v A$. For general A, a similar interpretation
is valid, except that the members of L(A) are required also to vanish at certain
points at least to certain orders.

We shall suppress any name for the function that embeds $\mathbb{V}_F$ in $D_F$. Thus
for example if $v_0$ is in $\mathbb{V}_F$, then $L(v_0)$ refers to L(A) for the divisor A such that
$\operatorname{ord}_{v_0} A = 1$ and $\operatorname{ord}_v A = 0$ when $v \neq v_0$.
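
To make the definition concrete, here is a small sketch under stated assumptions (it is not part of the text): take $F = k(u)$ with $u$ transcendental, write $v_p$ for the valuation attached to a monic irreducible polynomial $p$ of $k[u]$, and write $v_\infty$ for the degree valuation, which sends a quotient of polynomials to the degree of the denominator minus the degree of the numerator. These labels are illustrative.

```latex
% F = k(u); the divisor is n v_\infty with n >= 0.
\begin{align*}
L(n\,v_\infty)
  &= \{h \in k(u) \mid v_p(h) \ge 0 \text{ for every monic irreducible } p,\
       v_\infty(h) \ge -n\} \\
  &= \{h \in k[u] \mid \deg h \le n\} \cup \{0\}, \\
\dim_k L(n\,v_\infty) &= n + 1 .
\end{align*}
```

The first condition forbids poles at finite points, so $h$ is a polynomial, and the second bounds the order of the pole at infinity by $n$.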


Corollary 9.4. $L(0) = k$, and $L(A) = 0$ if A is a nonzero divisor with $A \le 0$.


PROOF. If $A \le 0$ is nontrivial and if $x \in F^\times$ were to have $(x) \ge -A$, then we
would have $\deg (x) \ge -\deg A > 0$, in contradiction to the conclusion $\deg (x) = 0$
of Theorem 9.3. Thus $L(A) = 0$. Next, we have

$$L(0) = \{x \in F^\times \mid v(x) = 0 \text{ for all } v\} \cup \bigcup_{v \in \mathbb{V}_F} L(-v).$$

The first term on the right side is $k^\times$, and the second term gives 0 by what we
have just proved. Hence $L(0) = k$.


If $A \le B$, then it follows from the definition that $L(A) \subseteq L(B)$. We shall
be interested in how much L(B) increases when B increases. This change is
measured by what happens to the quotient space L(B)/L(A). The key case is
that $B = A + v_0$ for some $v_0 \in \mathbb{V}_F$, and we treat that in the following lemma.


$^{8}$ In doing so, we follow the approach in the book by Villa Salvador, Chapter 3, but with different
notation.


Lemma 9.5. If A is a divisor and $v_0$ is in $\mathbb{V}_F$, then
$$\dim_k L(A + v_0)/L(A) \le f_{v_0} = \deg v_0.$$


PROOF. Put $f = f_{v_0}$, let $R_{v_0}$ be the valuation ring of $v_0$, and let $P_{v_0}$ be the
valuation ideal of $v_0$. Since $v_0$ carries $F^\times$ onto Z, we can choose an element
$y \in F^\times$ with $v_0(y) = \operatorname{ord}_{v_0}(A + v_0)$.

Let $f + 1$ members $x_1, \dots, x_{f+1}$ of $L(A + v_0)$ be given. We shall produce an
equation of linear dependence among the cosets $x_i + L(A)$, and this will prove
the lemma. Computation gives

$$v_0(x_i y) = v_0(x_i) + v_0(y) = v_0(x_i) + \operatorname{ord}_{v_0}(A + v_0) \ge 0$$

for $1 \le i \le f + 1$, since $x_i$ is in $L(A + v_0)$. Hence $x_i y$ is in $R_{v_0}$. Since
$\dim_k(R_{v_0}/P_{v_0}) = f$, there exist members $c_1, \dots, c_{f+1}$ of k not all 0 such that
$\sum_{i=1}^{f+1} c_i (x_i y + P_{v_0}) = P_{v_0}$, i.e., such that $\sum_{i=1}^{f+1} c_i x_i y$ lies in $P_{v_0}$. Then
$\sum_{i=1}^{f+1} c_i x_i$ lies in $y^{-1} P_{v_0}$, and

$$v_0\Bigl(\sum_{i=1}^{f+1} c_i x_i\Bigr) \ge -v_0(y) + 1 = -\operatorname{ord}_{v_0}(A + v_0) + 1 = -\operatorname{ord}_{v_0} A. \qquad (*)$$

Since each $x_i$ is in $L(A + v_0)$, so is $\sum_{i=1}^{f+1} c_i x_i$. This fact and $(*)$ together show
that $\sum_{i=1}^{f+1} c_i x_i$ is in L(A), i.e., that $\sum_{i=1}^{f+1} c_i x_i + L(A)$ is the 0 coset. This proves
the desired linear dependence and shows that $\dim_k L(A + v_0)/L(A) \le f$.


Theorem 9.6. If A and B are divisors such that $A \le B$, then L(B)/L(A) is
finite-dimensional over k with

$$\dim_k L(B)/L(A) \le \deg B - \deg A.$$

Moreover, L(A) and L(B) are separately finite-dimensional over k, and consequently

$$\dim_k L(B) - \deg B \le \dim_k L(A) - \deg A.$$


REMARKS. We define $\ell(A) = \dim_k L(A)$. This is finite by the theorem, and
the resulting inequality of the theorem is that

$$\ell(B) - \deg B \le \ell(A) - \deg A.$$


PROOF. The first conclusion is immediate from Lemma 9.5 by induction
on $\sum_v (\operatorname{ord}_v B - \operatorname{ord}_v A)$. Fixing a reference point $v_0$ in $\mathbb{V}_F$ and taking
$A = \sum_{\operatorname{ord}_v B < 0} (\operatorname{ord}_v B)\,v - v_0$ and applying Corollary 9.4 to A, we see that $L(A) = 0$.
Therefore the first conclusion specializes to

$$\dim_k L(B) - \deg B \le -\deg A.$$

Since $\dim_k L(B)$ is certainly nonnegative, this inequality implies that L(B) is
finite-dimensional. Then we can expand the left side of the first conclusion of the
theorem to obtain

$$\dim_k L(B) - \dim_k L(A) \le \deg B - \deg A,$$

and the proof is complete.


The theorem identifies $\ell(B) - \deg B$ as a quantity of interest when we are trying
to understand a divisor B. We shall undertake a study of this quantity, beginning
first with the case of a divisor B equal to a multiple of the pole part $(x)_\infty$ of a
principal divisor $(x)$. Recall that the signs are arranged to have $(x)_\infty \ge 0$.


Lemma 9.7. For each x in F that is not in k, there exists a constant $C_x$ such
that the multiple $p(x)_\infty$ of $(x)_\infty$ satisfies

$$\ell(p(x)_\infty) - \deg(p(x)_\infty) \ge C_x$$

for every integer p.


PROOF. Applying the observation at the end of Section 1, we form the integral
closure R of k[x] in F and the integral closure R′ of $k[x^{-1}]$ in F. The discrete
valuations v for which $v(x) < 0$ are exactly those arising from prime ideals in
the prime decomposition of the ideal $x^{-1}R'$ generated by $x^{-1}k[x^{-1}]$ in R′,
according to Corollary 6.10. Specifically this ideal decomposes as a product
$Q_1^{e'_1} \cdots Q_{g'}^{e'_{g'}}$, and the corresponding discrete valuations have $v_{Q_j}(x) = -e'_j$.
Theorem 9.3 shows that $\deg (x)_\infty = \dim_{k(x)} F$.

Let $n = \dim_{k(x)} F$. Choose a basis $y_1, \dots, y_n$ of F over k(x) consisting of
members of R. Each v arising from a prime ideal of R has $v(y_j) \ge 0$ for
$1 \le j \le n$ by Proposition 6.7. The remaining v’s all have $v(x) < 0$, and
therefore there exists an integer $k \ge 0$ (an integer, not to be confused with the
base field) such that $v(y_j) \ge k\,v(x)$ for $1 \le j \le n$ and
for all these remaining v’s. For this value of the integer k, the elements $y_1, \dots, y_n$
all lie in $L(k(x)_\infty)$.

Let $m \ge 0$ be arbitrary. The v’s coming from some $Q_j$, i.e., those with
$v(x) < 0$, have $v(x^i) \ge v(x^m)$ whenever $0 \le i \le m$, and the remaining v’s,
i.e., those with $v(x) \ge 0$, all have $v(x^i) \ge 0$ for $0 \le i \le m$. Therefore
$1, x, x^2, \dots, x^m$ all lie in $L((x^m)_\infty) = L(m(x)_\infty)$.

Multiplying, we see that $x^i y_j$ lies in $L((k + m)(x)_\infty)$ for $0 \le i \le m$ and
$1 \le j \le n$. These elements $x^i y_j$ are linearly independent over the base field,
and therefore

$$\ell((k + m)(x)_\infty) \ge (m + 1)n = (m + 1)\deg (x)_\infty.$$

Since deg is a homomorphism from $D_F$ into Z,

$$\deg((k + m)(x)_\infty) = (k + m)\deg (x)_\infty.$$

Therefore each $m \ge 0$ has

$$\ell((k + m)(x)_\infty) - \deg((k + m)(x)_\infty) \ge (m + 1 - k - m)\deg (x)_\infty = (1 - k)\deg (x)_\infty.$$

We have therefore proved that

$$\ell(q(x)_\infty) - \deg(q(x)_\infty) \ge (1 - k)\deg (x)_\infty$$

for all integers q that are sufficiently positive. If p is any integer, we can find q
as above with $p \le q$. Then $p(x)_\infty \le q(x)_\infty$, and Theorem 9.6 shows that

$$(1 - k)\deg (x)_\infty \le \ell(q(x)_\infty) - \deg(q(x)_\infty) \le \ell(p(x)_\infty) - \deg(p(x)_\infty).$$

This proves the lemma with $C_x = (1 - k)\deg (x)_\infty$.


Lemma 9.8. If A is any divisor and x is any member of $F^\times$, then $L((x) + A) \cong
L(A)$ canonically. Therefore $\ell((x) + A) = \ell(A)$. In addition, $\deg((x) + A) =
\deg A$.


PROOF. Define a k linear mapping $\varphi : L(A) \to F$ by $\varphi(y) = x^{-1}y$. This is
certainly one-one, and its image is contained in $L((x) + A)$ because any nonzero z
in L(A) has $(z) \ge -A$ and then also $(x^{-1}z) = -(x) + (z) \ge -(x) - A$. Similarly
$\psi(y) = xy$ is one-one and carries $L((x) + A)$ into L(A). By inspection, $\psi\varphi = 1$
and $\varphi\psi = 1$. Therefore $L((x) + A)$ and L(A) are canonically isomorphic
and have the same dimension over k. For the last conclusion, $\deg((x) + A) =
\deg (x) + \deg A = \deg A$ by Theorem 9.3.


Theorem 9.9 (Riemann’s inequality). For each x in F that is not in k, let $g_x$
be the integer such that $1 - g_x$ is the largest possible $C_x$ with

$$\ell(p(x)_\infty) - \deg(p(x)_\infty) \ge C_x$$

for every integer p. Then
(a) the integer $g = g_x$ is independent of x,
(b) g is $\ge 0$,
(c) $\ell(A) - \deg A \ge 1 - g$ for every divisor A.


REMARKS. The integer $g_x$ in the theorem exists by Lemma 9.7. Once it has
been proved to be an integer g independent of x, it is called the genus of the
function field F over k.


PROOF. We begin by proving (c) with g replaced by $g_x$. Let $C_x$ be any integer
with the property that $\ell(p(x)_\infty) - \deg(p(x)_\infty) \ge C_x$ for all p. If a divisor A
is given, we can write $A = A_0 - A_\infty$, where $A_0 = \sum_{\operatorname{ord}_v A > 0} (\operatorname{ord}_v A)\,v$ and
$A_\infty = \sum_{\operatorname{ord}_v A < 0} (-\operatorname{ord}_v A)\,v$. Then $A \le A_0$, and Theorem 9.6 shows that
$\ell(A) - \deg A \ge \ell(A_0) - \deg A_0$. Thus it is enough to prove (c) for $A_0$. Let p be
any integer $\ge 0$. Since $A_0 \ge 0$, we have $p(x)_\infty - A_0 \le p(x)_\infty$. Hence a second
application of Theorem 9.6 shows that

$$\ell(p(x)_\infty - A_0) - \deg(p(x)_\infty - A_0) \ge \ell(p(x)_\infty) - \deg(p(x)_\infty) \ge C_x.$$

Since deg is a homomorphism, this inequality implies that

$$\ell(p(x)_\infty - A_0) \ge C_x + p\deg (x)_\infty - \deg A_0.$$

Fix an integer p large enough for the right side to be positive. For this p, the
vector space $L(p(x)_\infty - A_0)$ is nonzero; let y be a nonzero member of it. This
y has $(y) \ge -(p(x)_\infty - A_0)$, and hence $p(x)_\infty \ge A_0 - (y)$. A third application
of Theorem 9.6, in combination with Lemma 9.8, shows that

$$\ell(p(x)_\infty) - \deg(p(x)_\infty) \le \ell(A_0 - (y)) - \deg(A_0 - (y)) = \ell(A_0) - \deg A_0.$$

The left side is $\ge C_x$, and hence $\ell(A_0) - \deg A_0 \ge C_x$. Therefore

$$\ell(A) - \deg A \ge C_x \qquad (*)$$

for every divisor A. Since one choice of $C_x$ is $1 - g_x$, this proves (c).
Taking $A = p(y)_\infty$, we see that the best $C_y$ has $C_y \ge C_x$. Since the roles of
x and y can be interchanged, this proves (a). Finally if we take A = 0 in (c) and
apply Corollary 9.4, we see that $1 - 0 \ge 1 - g$. Thus $g \ge 0$. This proves (b).


EXAMPLES OF GENUS. 


(1) F = k(x) for a transcendental x. In the proof of Lemma 9.7, we have
n = 1 and can take $y_1 = 1$. Then k = 0, and the proof of the lemma shows that
the inequality of the lemma holds with $C_x = (1 - 0)\deg (x)_\infty = 1$. Therefore
$1 - g \ge C_x = 1$, and $g \le 0$. So g = 0 by Theorem 9.9b.

(2) $F = C[x, y]/(y^2 - x^4 + 1)$. This example was discussed in Section 1, and
we have $x^{-1}R' = P_1 P_2$ with $P_1 = (x^{-1}, x^{-2}y + 1)$ and $P_2 = (x^{-1}, x^{-2}y - 1)$.
The corresponding valuations therefore have $v_{P_1}(x) = v_{P_2}(x) = -1$. Meanwhile,
the elements 1 and y form a basis of F over k(x). The element 1 has $v_{P_1}(1) =
v_{P_2}(1) = 0$; so 1 is in $L(p(x)_\infty)$ for every $p \ge 0$. Since $x^{-2}y$ is the sum of a
generator of $P_1$ and a generator of $P_2$, $x^{-2}y$ lies in R′. Write $(x^{-2}y) = I_1 \cdots I_l$,
where each $I_j$ is a prime ideal in R′. Since $x^{-2}y$ and $P_1$ together generate 1,
$P_1$ is not one of the ideals $I_j$. Similarly $P_2$ is not one of the $I_j$’s. Thus $(y) =
(x^{-1})^{-2}(x^{-2}y) = (P_1 P_2)^{-2} I_1 \cdots I_l$, and we obtain $v_{P_1}(y) = v_{P_2}(y) = -2$.
Hence y lies in $L(2(x)_\infty)$, and we can take k = 2 in the proof of Lemma 9.7. For
this k, we have $C_x = (1 - 2)\deg (x)_\infty = -2$. Therefore $1 - g \ge C_x = -2$, and
$g \le 3$. In fact, g = 1 here, as a special case of the next example. Thus a routine
use of the estimate from Lemma 9.7 has its limitations.

(3) $F = k[x, y]/(y^2 - p(x))$, where p(x) is a square-free polynomial of degree
m and k has characteristic $\neq 2$. Then $g = \tfrac{1}{2}m - 1$ if m is even and $g = \tfrac{1}{2}(m - 1)$
if m is odd. This computation will be carried out in Problems 12-20 at the end
of the chapter.
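
For orientation, the formula in Example 3 can be tabulated for small degrees. The table below is only an evaluation of the stated formula, with $m = \deg p$; the equivalent closed form $g = \lfloor (m-1)/2 \rfloor$ is a restatement introduced here, not notation from the text.

```latex
% g = m/2 - 1 for m even, g = (m-1)/2 for m odd,
% equivalently g = \lfloor (m-1)/2 \rfloor.
\[
\begin{array}{c|cccccc}
m = \deg p & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline
g          & 0 & 0 & 1 & 1 & 2 & 2
\end{array}
\]
```

The column $m = 4$ is the case of Example 2, where the text records $g = 1$ even though the estimate from Lemma 9.7 gives only $g \le 3$.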


Theorem 9.9 gives the lower bound of $1 - g$ for $\ell(A) - \deg A$ for all divisors
A. There is also an upper bound, with the proviso that $L(A) \neq 0$.


Proposition 9.10. If A is any divisor such that $L(A) \neq 0$, then

$$\ell(A) - \deg A \le 1.$$

Hence any divisor A with $\deg A \le -1$ has $\ell(A) = 0$.


PROOF. Let y be a member of $F^\times$ that lies in L(A). Then every $v \in \mathbb{V}_F$ has
$v(y) \ge -\operatorname{ord}_v A$ and hence $0 \ge -\operatorname{ord}_v A - v(y) = -\operatorname{ord}_v(A + (y))$. This
inequality says that $A + (y) \ge 0$. Then Corollary 9.4 and Theorem 9.6 together
give

$$1 = \ell(0) - \deg 0 \ge \ell(A + (y)) - \deg(A + (y)),$$

and the right side equals $\ell(A) - \deg A$ by Lemma 9.8. Then $1 - \deg A \le
\ell(A) - \deg A \le 1$, and we must have $\deg A \ge 0$ whenever $\ell(A) \ge 1$.
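
The two bounds can be compared in the simplest case. The following is a sketch under assumptions not made in the text: $F = k(u)$ with $u$ transcendental (so $g = 0$ by Example 1), $v_\infty$ denotes the degree valuation with residue class degree 1, and $A = n\,v_\infty$. It uses the standard description of $L(n\,v_\infty)$ for $n \ge 0$ as the polynomials in $u$ of degree at most $n$.

```latex
% F = k(u), g = 0, A = n v_\infty.
\begin{align*}
n \ge 0:&\quad \ell(A) = n + 1,\quad \deg A = n,\quad
  \ell(A) - \deg A = 1 = 1 - g, \\
n \le -1:&\quad \ell(A) = 0,\quad \deg A = n,\quad
  \ell(A) - \deg A = -n \ge 1 .
\end{align*}
```

For $n \ge 0$ the lower bound of Theorem 9.9 and the upper bound of Proposition 9.10 are both attained. For $n \le -1$ the space $L(A)$ is 0 by Corollary 9.4, so the proviso of Proposition 9.10 fails and the upper bound does not apply.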


4. Riemann–Roch Theorem


Riemann’s inequality, proved in Section 3, shows that every divisor A satisfies
$\ell(A) - \deg A \ge 1 - g$, where g is the genus of the curve in question. The
Riemann–Roch Theorem, to be proved in the present section, gives an interpretation
for the difference between the two sides of the inequality.

In the classical setting of compact Riemann surfaces, the proof of the Riemann–Roch
Theorem makes use of meromorphic differential forms, sometimes called
abelian differentials by complex analysts. Meromorphic differential forms are
objects that locally look like $f(z)\,dz$, where z is a local coordinate and f(z)
is a meromorphic function, and that fit together to be globally defined on the
complex manifold. What the formula $f(z)\,dz = g(w)\,dw$ for fitting together
means is that in the overlap of the regions for two local coordinates z and w,
$f(z)\,dz = g(w(z))\,\frac{dw}{dz}\,dz$ holds and hence $f(z) = g(w(z))\,\frac{dw}{dz}$. In the language
of differential geometry, a meromorphic differential form is a meromorphic section
of the cotangent bundle of the complex manifold. An important step that
has to be carried out to make these differential forms useful is to prove a version
of the Residue Theorem. This theorem says that the sum over all points of the
manifold of the residues of the differential form is 0, the residue of $f(z)\,dz$ at
the point corresponding to z = 0 being the coefficient of $z^{-1}$ in the Laurent
expansion$^{9}$ of f(z) about 0. Once this theorem is in hand, one can begin to prove
the Riemann–Roch Theorem.

In our present setting with the function field F in one variable over k, it is not 
too hard to define an analog of meromorphic differential forms and to establish 
that they behave the way one would expect from differential calculus. In order 
to make use of these forms, one has to prove an analog of the Residue Theorem, 
and doing so requires some hard work. A. Weil discovered that this construction 
could be bypassed and that one could prove the theorem directly. The idea is to 
introduce the tool that differential forms make available and to skip the differential 
forms themselves. 

It is worth understanding this background in a little more detail because oth- 
erwise the proof below may seem very strange indeed. To fix the ideas for this 
background only, suppose that the base field k is algebraically closed. Let us 
recall that elements of $\mathbb{V}_F$ are meant to correspond to points of a zero locus in
projective space, at least when the curve is everywhere nonsingular. We write
this correspondence as $v \mapsto p(v)$. A local coordinate about p(v) is denoted
by a symbol like z classically, and in the setup with valuations, it is simply a
member of the valuation ideal of v with $v(z) = 1$. A differential form that is
given locally by classical expressions like $f(z)\,dz$ attaches to each v in $\mathbb{V}_F$ the
function $g_v \mapsto \operatorname{Residue}_{p(v)}(g_v f\,dz)$, where $g_v$ is any Laurent expansion about
p(v).

Classically this Laurent expansion is to be convergent in some deleted neighborhood
of p(v), and it involves only finitely many negative powers of the
local coordinate. The assumption that it converges is not important because
if $v(f) = n$, then the only powers of z whose coefficients in $g_v$ affect the residue
at p(v) are the $k$-th powers for $k + n \le -1$. Thus the assumption on $g_v$ is that it is
a member of the Laurent series field k((z)). To compute the residue for $g_v f\,dz$,
we need to know how to interpret f(z) as a Laurent series about p(v). Let $R_v$
be the valuation ring of v, and let $P_v$ be the valuation ideal. The field $R_v/P_v$ is
a finite extension of k and must be isomorphic to k because k is algebraically
closed. For each $c \in k$, choose a member $a_c \in R_v$ such that the coset $a_c + P_v$
corresponds to c; we may assume that the member chosen for c = 0 is 0. Denote the
set of these elements $a_c$ by $R_k$. If $v(f) = n$, then $h = z^{-n}f$ is in $R_v$, and thus some
unique $a_0$ in $R_k$ has the property that $h - a_0$ is in $P_v$. Hence $z^{-1}(h - a_0)$ is in $R_v$,
and some unique $a_1$ in $R_k$ has the property that $z^{-1}(h - a_0) - a_1$ is in $P_v$. From this,
$z^{-1}(z^{-1}(h - a_0) - a_1)$ is in $R_v$, and we can continue to subtract members of
$R_k$ and divide by z in this way. The result is that $h = a_0 + a_1 z + a_2 z^2 + \cdots$
in the sense that $v(h - a_0 - a_1 z - \cdots - a_k z^k) \ge k + 1$ for every k. Therefore
$f = z^n h = z^n(a_0 + a_1 z + a_2 z^2 + \cdots)$. If we replace each $a_j$ by the corresponding
member $c_j$ of k, then $z^n(c_0 + c_1 z + c_2 z^2 + \cdots)$ is the member of k((z)) that we
associate to f.


$^{9}$ One has to show that this coefficient is independent of the choice of the local coordinate.

With this identification in place, we can regard the given differential form as
yielding a k linear function

$$\operatorname{Residue} : \prod_{v \in \mathbb{V}_F} k((z)) \longrightarrow \prod_{v \in \mathbb{V}_F} k.$$

We want to cut down the domain of this mapping so the sum of the residues is
meaningful for every member of the image. The local expressions $f(z)\,dz$ involve
only finitely many poles in a neighborhood of each point, and compactness implies
that there are only finitely many such points globally. Except at these points the
residue of $g_v f\,dz$ can be nonzero only if $g_v$ has a pole at p(v). Thus we can
ensure that the sum of the residues is meaningful if we assume that $v(g_v) \ge 0$
except for finitely many v.

For algebraic purposes the domain is still unnecessarily large. Since each
local coordinate in the algebraic realization is actually a member of F, the only
members of k((z)) that we need to handle at each point are the members of F.
So let $\mathbb{A}'_F = \prod_{v \in \mathbb{V}_F} F$, and let $\mathbb{A}_F$ be the k subspace of all members $\{g_v\}$ of the
product such that $v(g_v) < 0$ only finitely often. Then the differential form gives
us a k linear functional

$$\text{Sum of Residues} : \mathbb{A}_F \longrightarrow k.$$

We have seen that if the differential form is given by $f(z)\,dz$ locally near p(v) and
if $v(g_v) \ge -v(f)$, then the residue is 0 at p(v). Hence there is some divisor A,
depending on the differential form, such that if $v(g_v) \ge -\operatorname{ord}_v A$ for all $v \in \mathbb{V}_F$,
then all residues are 0 and the sum of the residues is 0. Consequently the kernel
of the sum-of-residues map associated to the differential form contains all tuples
$\{g_v\}$ of $\mathbb{A}_F$ such that $v(g_v) \ge -\operatorname{ord}_v A$ for this divisor A and all v.

Finally there is one more classical fact to bring into play. This is the Residue
Theorem itself, saying that the sum of the residues is zero for any meromorphic
differential form. If $\{g_v\}$ is actually a constant tuple with $g_v = h$ for some $h \in F$,
then the sum-of-residues map as defined above is giving us the classical sum of
residues for the product of h and the given differential form. This sum is zero. In
other words, every member of the diagonally embedded F in $\mathbb{A}_F$ lies in the kernel
of the sum-of-residues map associated to the differential form.

Weil’s idea in a nutshell is that instead of developing differential forms, working
with residues, and proving the consequence of the Residue Theorem, one should
just start with any abstract linear functional on $\mathbb{A}_F$ that satisfies the conditions
that we noted above. Then the Riemann–Roch Theorem drops out fairly easily.
This is the approach we shall follow. The abstract kind of linear functional on
$\mathbb{A}_F$ will be called a “differential” in what follows, as a reminder of the classical
object that lies behind it.$^{10}$


Without further ado, we proceed with the Riemann–Roch Theorem. In this
section, F denotes a function field in one variable over a field k, and we assume
that every element of F outside k is transcendental over k. We continue with the
notation $\mathbb{V}_F$, $D_F$, $f_v$, $\operatorname{ord}_v A$, $\deg A$, and $(x)$ for $x \in F^\times$, all as in Sections 2-3,
and with the notation L(A) and $\ell(A)$ as in Section 3. If A is a divisor, we let

$$\delta(A) = \ell(A) - \deg A - (1 - g).$$

Riemann’s inequality (Theorem 9.9) implies that $\delta(A) \ge 0$ for all A’s and that
$\delta(A) = 0$ for some A’s. We seek an interpretation of $\delta(A)$.

Let $\mathbb{A}'_F$ be the ring of all functions from $\mathbb{V}_F$ into F, with the operations taken
pointwise. It is customary to write such a function as $v \mapsto \xi_v$ rather than as
$v \mapsto \xi(v)$. Let $\mathbb{A}_F$ be the subring$^{11}$ of all members $\xi$ of $\mathbb{A}'_F$ such that $v(\xi_v) < 0$
for only finitely many v in $\mathbb{V}_F$. We shall treat $\mathbb{A}_F$ as an infinite-dimensional
associative k algebra with identity.

Consider the diagonal map $\Delta : F \to \mathbb{A}_F$ defined by the formula $\Delta(x)_v = x$ for
all $x \in F$. Under this map, the member x of F goes to the function whose value
at each v is x. The reason that $\Delta(x)$ is in $\mathbb{A}_F$ and not just $\mathbb{A}'_F$ is that $v(x) < 0$ for
only finitely many $v \in \mathbb{V}_F$. The map $\Delta$ is a one-one k algebra homomorphism.


$^{10}$ Weil’s argument dates to 1935. It appears in book form in Weil’s Basic Number Theory, where
the details are carried out when k is a finite field and where comments are made for general k. Lang
simplified Weil’s argument and wrote it down for algebraically closed fields k in his Introduction
to Algebraic and Abelian Functions. A version of this argument for general k appears in Villa
Salvador’s book. The present exposition benefits from all three of these books.

$^{11}$ For readers familiar with Section VI.10, the notation is intended to hint at “adeles” of F.
However, completions and topologies will play no role in the construction.


For each divisor A, define

$$\mathcal{L}(A) = \{\xi \in \mathbb{A}_F \mid v(\xi_v) \ge -\operatorname{ord}_v(A) \text{ for all } v \in \mathbb{V}_F\}.$$

It is immediate from the definitions that

$$\mathcal{L}(A) \cap \Delta(F) = \Delta(L(A)).$$

Let us see that

$$A \le B \quad\text{if and only if}\quad \mathcal{L}(A) \subseteq \mathcal{L}(B).$$

In fact, the “only if” part of the statement is evident. Conversely suppose that
$\mathcal{L}(A) \subseteq \mathcal{L}(B)$. Choose for each $v \in \mathbb{V}_F$ an element $z_v$ in F with $v(z_v) = 1$. The
function $\xi_A : \mathbb{V}_F \to F$ defined by $(\xi_A)_v = z_v^{-\operatorname{ord}_v A}$ has $v((\xi_A)_v) = -\operatorname{ord}_v A$
and lies in $\mathbb{A}_F$, since $\operatorname{ord}_v A$ is nonzero for only finitely many v. The definitions
show that $\xi_A$ lies in $\mathcal{L}(A)$, hence in $\mathcal{L}(B)$. Thus $-\operatorname{ord}_v(A) = v((\xi_A)_v) \ge
-\operatorname{ord}_v B$, $\operatorname{ord}_v A \le \operatorname{ord}_v B$, and $A \le B$. This proves the “if” part of the
displayed equivalence. If we apply the equivalence twice, we see that

$$A = B \quad\text{if and only if}\quad \mathcal{L}(A) = \mathcal{L}(B).$$


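Let us take note of two operations on divisors A and the effect of these operations
on the spaces $\mathcal{L}(A)$. If A and B are divisors, we define $C = \min(A, B)$
pointwise by the formula $\operatorname{ord}_v C = \min(\operatorname{ord}_v A, \operatorname{ord}_v B)$. Then C is a divisor
with $C \le A$ and $C \le B$. Thus $\mathcal{L}(C) \subseteq \mathcal{L}(A)$ and $\mathcal{L}(C) \subseteq \mathcal{L}(B)$, and we
consequently obtain

$$\mathcal{L}(\min(A, B)) \subseteq \mathcal{L}(A) \cap \mathcal{L}(B).$$

Still with A and B as divisors, we define $C = \max(A, B)$ pointwise by the
formula $\operatorname{ord}_v C = \max(\operatorname{ord}_v A, \operatorname{ord}_v B)$. Then $A \le C$ and $B \le C$, from which
we obtain $\mathcal{L}(A) \subseteq \mathcal{L}(C)$ and $\mathcal{L}(B) \subseteq \mathcal{L}(C)$. This proves the inclusion $\subseteq$ in the
identity

$$\mathcal{L}(A) + \mathcal{L}(B) = \mathcal{L}(\max(A, B)).$$

To prove $\supseteq$, let $\xi$ be in $\mathcal{L}(\max(A, B))$. We shall decompose $\xi$ as a sum $\eta + \zeta$ in
$\mathcal{L}(A) + \mathcal{L}(B)$ with one of $\eta_v$ and $\zeta_v$ equal to 0 for each v. Let v be given. Since
$\xi$ is in $\mathcal{L}(\max(A, B))$, $v(\xi_v) \ge -\operatorname{ord}_v(\max(A, B)) = -\max(\operatorname{ord}_v A, \operatorname{ord}_v B)$.
That is, $-v(\xi_v) \le \max(\operatorname{ord}_v A, \operatorname{ord}_v B)$. If $-v(\xi_v) \le \operatorname{ord}_v A$, then define $\eta_v = \xi_v$
and $\zeta_v = 0$; otherwise, we have $-v(\xi_v) \le \operatorname{ord}_v B$, and we define $\eta_v = 0$ and
$\zeta_v = \xi_v$. Then $v(\eta_v) \ge -\operatorname{ord}_v A$ for all v, and $v(\zeta_v) \ge -\operatorname{ord}_v B$ for all v. This
proves $\supseteq$ in the displayed formula.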
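
A small numerical instance of the two operations may help fix the conventions. In this sketch $v_1$ and $v_2$ are two distinct hypothetical members of the set of valuations, chosen only for illustration.

```latex
% v_1 \ne v_2 are hypothetical valuations; min and max are taken coefficientwise.
\begin{align*}
A &= 2v_1 - v_2, \qquad B = v_1 + 3v_2, \\
\min(A, B) &= v_1 - v_2, \qquad \max(A, B) = 2v_1 + 3v_2, \\
\min(A, B) + \max(A, B) &= 3v_1 + 2v_2 = A + B .
\end{align*}
```

The last line holds for any pair of divisors, since the minimum and maximum of two integers always add up to the sum of the two integers.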


Lemma 9.11. If A and B are divisors with $A \le B$, then
$$\dim_k(\mathcal{L}(B)/\mathcal{L}(A)) = \deg B - \deg A.$$


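PROOF. Proceeding inductively, we see that it is enough to handle the case that
$B = A + v_0$, where $v_0$ is in $\mathbb{V}_F$. Thus we are to show that

$$\dim_k(\mathcal{L}(A + v_0)/\mathcal{L}(A)) = f_{v_0} = \deg(v_0). \qquad (*)$$

Put $f = f_{v_0}$, let $R_{v_0}$ be the valuation ring of $v_0$, and let $P_{v_0}$ be the valuation ideal
of $v_0$. To prove $\le$ in $(*)$, we argue as in the proof of Lemma 9.5. Since $v_0$ carries
$F^\times$ onto Z, we can choose an element $y \in F^\times$ with $v_0(y) = \operatorname{ord}_{v_0}(A + v_0)$.

Let $f + 1$ members $\xi^{(1)}, \dots, \xi^{(f+1)}$ of $\mathcal{L}(A + v_0)$ be given. We shall produce
an equation of linear dependence among the cosets $\xi^{(i)} + \mathcal{L}(A)$, and this will
prove $\le$ in $(*)$. Computation gives

$$v_0(\xi^{(i)}_{v_0} y) = v_0(\xi^{(i)}_{v_0}) + v_0(y) = v_0(\xi^{(i)}_{v_0}) + \operatorname{ord}_{v_0}(A + v_0) \ge 0$$

for $1 \le i \le f + 1$, with the inequality at the right holding because $\xi^{(i)}$ is
in $\mathcal{L}(A + v_0)$. Hence $\xi^{(i)}_{v_0} y$ is in $R_{v_0}$. Since $\dim_k(R_{v_0}/P_{v_0}) = f$, there exist
members $c_1, \dots, c_{f+1}$ of k not all 0 such that $\sum_{i=1}^{f+1} c_i(\xi^{(i)}_{v_0} y + P_{v_0}) = P_{v_0}$, i.e.,
such that $\sum_{i=1}^{f+1} c_i \xi^{(i)}_{v_0} y$ lies in $P_{v_0}$. Then $\sum_{i=1}^{f+1} c_i \xi^{(i)}_{v_0}$ lies in $y^{-1}P_{v_0}$, and

$$v_0\Bigl(\sum_{i=1}^{f+1} c_i \xi^{(i)}_{v_0}\Bigr) \ge -v_0(y) + 1 = -\operatorname{ord}_{v_0}(A + v_0) + 1 = -\operatorname{ord}_{v_0} A. \qquad (**)$$

Since each $\xi^{(i)}$ is in $\mathcal{L}(A + v_0)$, so is $\sum_{i=1}^{f+1} c_i \xi^{(i)}$. This fact and $(**)$ together
show that $\sum_{i=1}^{f+1} c_i \xi^{(i)}$ is in $\mathcal{L}(A)$, i.e., that $\sum_{i=1}^{f+1} c_i \xi^{(i)} + \mathcal{L}(A)$ is the 0 coset. This
proves the desired linear dependence and shows that $\dim_k \mathcal{L}(A + v_0)/\mathcal{L}(A) \le f$.

To prove $\ge$ in $(*)$, we shall produce f members $\xi^{(j)}$ of $\mathcal{L}(A + v_0)$ that are
linearly independent modulo $\mathcal{L}(A)$. We begin by choosing $\eta$ in $\mathcal{L}(A)$ with
$v_0(\eta_{v_0}) = -\operatorname{ord}_{v_0} A$. (For example take any member $\eta'$ of $\mathcal{L}(A)$, change $\eta'_{v_0}$
to a new value on which $v_0$ takes the value $-\operatorname{ord}_{v_0} A$, and leave $\eta'$ unchanged at
all other v.) Let $x_1, \dots, x_f$ be a set of representatives in $R_{v_0}$ of the f members of
a k basis of the quotient $R_{v_0}/P_{v_0}$, and let $z_{v_0}$ be a member of F with $v_0(z_{v_0}) = 1$.
Define $\xi^{(j)}$ for $1 \le j \le f$ by

$$\xi^{(j)}_v = \begin{cases} \eta_v & \text{for } v \neq v_0, \\ \eta_{v_0} x_j z_{v_0}^{-1} & \text{for } v = v_0. \end{cases}$$

For each j, we have

$$v_0(\eta_{v_0} x_j z_{v_0}^{-1}) = v_0(\eta_{v_0}) + v_0(x_j) - v_0(z_{v_0}) = -\operatorname{ord}_{v_0} A + v_0(x_j) - 1 \ge -\operatorname{ord}_{v_0} A - 1,$$

and thus $\xi^{(j)}$ is in $\mathcal{L}(A + v_0)$. To prove the linear independence modulo $\mathcal{L}(A)$,
suppose that $c_1, \dots, c_f$ are members of k such that $\sum_{j=1}^{f} c_j \xi^{(j)}$ is in $\mathcal{L}(A)$. In
this case we have an inequality $v_0\bigl(\sum_{j=1}^{f} c_j \xi^{(j)}_{v_0}\bigr) \ge -\operatorname{ord}_{v_0} A$, which expands out
as

$$v_0\Bigl(\sum_{j=1}^{f} c_j \eta_{v_0} x_j z_{v_0}^{-1}\Bigr) \ge v_0(\eta_{v_0}).$$

Since $v_0(z_{v_0}^{-1}) = -1$, subtraction of $v_0(\eta_{v_0})$ from both sides yields
$v_0\bigl(\sum_{j=1}^{f} c_j x_j\bigr) \ge 1$. Therefore $\sum_{j=1}^{f} c_j x_j$ lies in $P_{v_0}$. By the assumed linear
independence over k of the $x_j$’s modulo $P_{v_0}$, all the $c_j$’s are 0. Therefore the
elements $\xi^{(j)}$ are linearly independent modulo $\mathcal{L}(A)$, and the proof of $\ge$ in $(*)$ is complete.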


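Lemma 9.12. If A and B are divisors with $A \le B$, then there is an exact
sequence in the category of k vector spaces given by

$$0 \longrightarrow L(B)/L(A) \xrightarrow{\ \psi\ } \mathcal{L}(B)/\mathcal{L}(A) \xrightarrow{\ \varphi\ } (\mathcal{L}(B) + \Delta(F))/(\mathcal{L}(A) + \Delta(F)) \longrightarrow 0.$$

Consequently

$$\dim_k(\mathcal{L}(B) + \Delta(F))/(\mathcal{L}(A) + \Delta(F)) = (\ell(A) - \deg A) - (\ell(B) - \deg B) = \delta(A) - \delta(B).$$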


PROOF. The map $\psi$ is induced by the map $\Delta : L(B) \to \mathcal{L}(B)$ followed by
passage to the quotient. It descends to L(B)/L(A) because $\Delta(L(A)) \subseteq \mathcal{L}(A)$,
and it is one-one because $\Delta(L(B)) \cap \mathcal{L}(A) \subseteq \Delta(L(A))$. The map $\varphi$ is induced
by the map $x \mapsto x + \Delta(F)$ followed by passage to the quotient. It descends
to $\mathcal{L}(B)/\mathcal{L}(A)$ because $\mathcal{L}(A)$ maps into $\mathcal{L}(A) + \Delta(F)$, and it is onto because
$x \mapsto x + \Delta(F)$ carries $\mathcal{L}(B)$ onto $\mathcal{L}(B) + \Delta(F)$. The composition $\varphi\psi$ is 0
because L(B) maps under $\Delta$ into $\Delta(F)$, which lies in the 0 coset.

To prove the exactness, let $\xi + \mathcal{L}(A)$ be in $\ker \varphi$. This condition means that $\xi$
is in $\mathcal{L}(B)$ and has $\xi + \Delta(F)$ in $\mathcal{L}(A) + \Delta(F)$. Thus there exists $\eta$ in $\mathcal{L}(A)$ with
$\xi - \eta$ in $\Delta(F)$. Since $\xi$ and $\eta$ are in $\mathcal{L}(B)$, $\xi - \eta$ is in $\mathcal{L}(B) \cap \Delta(F) \subseteq \Delta(L(B))$.
Hence $\xi + \mathcal{L}(A) = (\xi - \eta) + \mathcal{L}(A)$ lies in $\Delta(L(B)) + \mathcal{L}(A) = \operatorname{image} \psi$, and
exactness is proved.

From the exactness we obtain

$$\dim_k \mathcal{L}(B)/\mathcal{L}(A) = \dim_k L(B)/L(A) + \dim_k(\mathcal{L}(B) + \Delta(F))/(\mathcal{L}(A) + \Delta(F)).$$

The left side equals $\deg B - \deg A$ by Lemma 9.11, and the first term on the right
side equals $\ell(B) - \ell(A)$ by the finite dimensionality of L(B) and L(A), which
was proved as part of Theorem 9.6. The result follows.
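
The final bookkeeping step of this proof can be written out in full. The following is only a rearrangement of the quantities named in the lemma, using the definition of $\delta$ from the start of the section; the script letter for the adelic space is a typesetting choice made here.

```latex
% Solve the dimension identity from exactness for the third term.
\begin{align*}
\dim_k \frac{\mathcal{L}(B) + \Delta(F)}{\mathcal{L}(A) + \Delta(F)}
 &= \dim_k \mathcal{L}(B)/\mathcal{L}(A) - \dim_k L(B)/L(A) \\
 &= (\deg B - \deg A) - (\ell(B) - \ell(A)) \\
 &= (\ell(A) - \deg A) - (\ell(B) - \deg B) \\
 &= \bigl(\ell(A) - \deg A - (1 - g)\bigr) - \bigl(\ell(B) - \deg B - (1 - g)\bigr) \\
 &= \delta(A) - \delta(B).
\end{align*}
```

The constant $1 - g$ cancels in the difference, which is why the genus does not appear in the first form of the conclusion.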


Theorem 9.13. There exists a divisor C such that $\mathbb{A}_F = \mathcal{L}(C) + \Delta(F)$. For
each divisor A,

$$\delta(A) = \dim_k\bigl(\mathbb{A}_F/(\mathcal{L}(A) + \Delta(F))\bigr).$$


PROOF. Riemann’s inequality produces a divisor C, specifically any sufficiently
large positive multiple of a divisor $(x)_\infty$, such that $\delta(C) = 0$. If we can
show that $\mathbb{A}_F = \mathcal{L}(C) + \Delta(F)$, then the dimensional equality in Lemma 9.12
with B = C will complete the proof of the present theorem.

Suppose that there exists a member $\xi$ of $\mathbb{A}_F$ that is not in $\mathcal{L}(C) + \Delta(F)$. For
each $v \in \mathbb{V}_F$, let $a_v = \min(v(\xi_v), -\operatorname{ord}_v C)$, and define $C' = -\sum_{v \in \mathbb{V}_F} a_v v$.
Since $\xi$ is in $\mathbb{A}_F$, only finitely many integers $v(\xi_v)$ are negative. This fact and
the fact that C is a divisor together imply that only finitely many $a_v$ are negative.
Since C is a divisor, only finitely many integers $-\operatorname{ord}_v C$ can be positive, and
thus only finitely many $a_v$ can be positive. Therefore C′ is a divisor.

The definition of C′ is arranged in such a way that $C \le C'$. Also, every v has
$v(\xi_v) \ge a_v = -\operatorname{ord}_v C'$, and hence $\xi$ lies in $\mathcal{L}(C')$. Consequently

$$\dim_k(\mathcal{L}(C') + \Delta(F))/(\mathcal{L}(C) + \Delta(F)) \ge 1.$$

By Lemma 9.12, $\delta(C) - \delta(C') \ge 1$. Since C was assumed to have $\delta(C) = 0$, we
obtain $-\delta(C') \ge 1$, in contradiction to the fact that $\delta(A) \ge 0$ for every divisor
A. We conclude that every $\xi$ in $\mathbb{A}_F$ lies in $\mathcal{L}(C) + \Delta(F)$.


Theorem 9.13 gives a first interpretation of the difference $\delta(A)$ between the
two sides of Riemann's inequality (Theorem 9.9). We shall now apply Theorem
9.13 and reinterpret $\delta(A)$ as the dimension $\ell(B)$ of a suitable divisor $B$ obtained
from $A$, and then we will have obtained the Riemann–Roch Theorem.

A differential of $F$ is a $k$ linear functional $\omega$ on $\mathbb{A}_F$ with the property that $\omega$
vanishes on $\mathcal{L}(A)$ for some divisor $A$ and $\omega$ vanishes also on $\Delta(F)$. The set of
all differentials of $F$ will be denoted by Diff(F). Let us observe that Diff(F) is
a vector subspace of $k$ linear functionals on $\mathbb{A}_F$. Scalar multiplication by $k$ is
not an issue. To see that Diff(F) is closed under pointwise addition, let $\omega$ and
$\omega'$ be differentials vanishing on $\mathcal{L}(A)$ and $\mathcal{L}(B)$, respectively. We have seen that
$\mathcal{L}(\min(A, B)) \subseteq \mathcal{L}(A) \cap \mathcal{L}(B)$. Thus $\omega + \omega'$ vanishes on $\mathcal{L}(\min(A, B))$. Since
$\omega + \omega'$ vanishes also on $\Delta(F)$, $\omega + \omega'$ is a differential.

The $k$ vector space of differentials vanishing on $\mathcal{L}(A) + \Delta(F)$ may be identified
with the vector space of $k$ linear functionals on the quotient $\mathbb{A}_F/(\mathcal{L}(A) + \Delta(F))$,
and the latter space is finite-dimensional of dimension $\delta(A)$ by Theorem 9.13.
Since a finite-dimensional vector space and its dual have the same dimension, the
$k$ vector space of differentials vanishing on $\mathcal{L}(A) + \Delta(F)$ has $k$ dimension $\delta(A)$.

In addition, Diff(F) carries a scalar multiplication by $F$ that makes it into an
$F$ vector space. What is required to verify this statement is a definition, and then
the verification of the properties of an $F$ vector space is routine. If $y$ is in $F$ and
$\omega$ is a differential, we define $y\omega$ on $\mathbb{A}_F$ by $(y\omega)(\xi) = \omega(\Delta(y)\xi)$. The linear
functional $y\omega$ vanishes on $\Delta(F)$ because $\Delta$ is a homomorphism. It is enough to
check for $y \neq 0$ that


    if $\omega$ vanishes on $\mathcal{L}(A)$, then $y\omega$ vanishes on $\mathcal{L}(A + (y))$,


where $(y)$ is the principal divisor corresponding to $y$. To prove this vanishing,
let $\xi$ be in $\mathcal{L}(A + (y))$. Then $v(\xi_v) \ge -\operatorname{ord}_v(A + (y)) = -\operatorname{ord}_v A - \operatorname{ord}_v (y) =
-\operatorname{ord}_v A - v(y)$, which implies that $v(\xi_v y) \ge -\operatorname{ord}_v A$, which implies that $\xi\Delta(y)$
lies in $\mathcal{L}(A)$, which implies that $\omega(\xi\Delta(y)) = 0$, which implies that $(y\omega)(\xi) = 0$.
This proves the asserted vanishing, and it follows that Diff(F) carries a well-defined
scalar multiplication by $F$.

Each set $\mathcal{L}(A)$, where $A$ is a divisor, will be called a parallelotope of $\mathbb{A}_F$.
These sets are large subsets of $\mathbb{A}_F$, since $\dim_k \mathbb{A}_F/(\mathcal{L}(A) + \Delta(F))$ is finite and
$\dim_k \mathbb{A}_F/\Delta(F)$ is infinite. We are going to associate a particular parallelotope
to each nonzero differential. Since we have seen that distinct parallelotopes
correspond to distinct divisors, we shall obtain a way of associating a divisor to
each nonzero differential.


Corollary 9.14. If $\omega$ is a nonzero differential and $\mathcal{L}(A)$ is a parallelotope in
its kernel, then


    \[ \ell(A) \le \delta(0) \quad\text{and}\quad \deg A \le \delta(0) + g - 1. \]


Consequently there exists a unique maximum parallelotope on which $\omega$ vanishes.


REMARKS. In view of the remarks before the corollary, we therefore obtain a
function $\omega \mapsto \operatorname{Div}(\omega)$ from the set $\operatorname{Diff}(F) - \{0\}$ of nonzero differentials into the
set $D_F$ of divisors.


PROOF. If we know that $\ell(A) \le \delta(0)$, then addition to this inequality of
Riemann's inequality $\deg A - \ell(A) \le g - 1$ as given in Theorem 9.9 shows that


    \[ \deg A \le \delta(0) + g - 1 \]


and proves the second inequality. The inequality $\ell(A) \le \delta(0)$ is trivial if
$L(A) = 0$.

Therefore we may assume in the two inequalities that $L(A) \neq 0$. Let $y$ be
any nonzero member of $L(A)$. Since the kernel of $\omega$ contains $\mathcal{L}(A)$, the kernel
of $y\omega$ contains $\mathcal{L}(A + (y))$, by a computation made above. Meanwhile, the
element $y$, being in $L(A)$, has $(y) \ge -A$ and hence $0 \le A + (y)$. Therefore
$\mathcal{L}(0) \subseteq \mathcal{L}(A + (y))$, and the kernel of $y\omega$ contains $\mathcal{L}(0)$. Since the kernel of $y\omega$
contains $\Delta(F)$, $y\omega$ is well defined on the quotient space $\mathbb{A}_F/(\mathcal{L}(0) + \Delta(F))$.

Now suppose that $y_1, \dots, y_n$ is a $k$ basis of $L(A)$. Let us use the fact that
$\omega \neq 0$ to prove that $y_1\omega, \dots, y_n\omega$ are linearly independent when viewed on
$\mathbb{A}_F/(\mathcal{L}(0) + \Delta(F))$: If $c_1, \dots, c_n$ are members of $k$ not all 0, then $z = \sum_{j=1}^n c_j y_j$
is a nonzero member of $L(A)$, and we have just seen that $z\omega$ is well defined on
$\mathbb{A}_F/(\mathcal{L}(0) + \Delta(F))$. Then we have $\sum_{j=1}^n c_j (y_j\omega) = \bigl(\sum_{j=1}^n c_j y_j\bigr)\omega = z\omega$, and
this cannot act as 0 on $\mathbb{A}_F/(\mathcal{L}(0) + \Delta(F))$ without being identically 0 on $\mathbb{A}_F$.
Since any $\xi$ such that $\omega(\xi) \neq 0$ has the property that $z\omega(\Delta(z)^{-1}\xi) \neq 0$, the
linear functionals $y_1\omega, \dots, y_n\omega$ on $\mathbb{A}_F/(\mathcal{L}(0) + \Delta(F))$ are linearly independent.

We know that $\delta(0) = \dim_k \mathbb{A}_F/(\mathcal{L}(0) + \Delta(F))$ by Theorem 9.13, and hence


    \[ n = \ell(A) \le \delta(0). \]


This completes the proof of the two inequalities. 

We turn to the existence and uniqueness of the maximum parallelotope on
which $\omega$ vanishes. We continue to assume that $\omega \neq 0$. Now suppose that
$A$ is a divisor such that $\omega$ vanishes on $\mathcal{L}(A)$. Suppose that $B$ is a divisor for
which $B \le A$ fails and for which $\omega(\mathcal{L}(B)) = 0$. We know that the divisor
$\max(A, B)$ has the property that $\mathcal{L}(\max(A, B)) = \mathcal{L}(A) + \mathcal{L}(B)$. Since $\omega$
vanishes on $\mathcal{L}(A)$ and $\mathcal{L}(B)$, it follows that it vanishes on $\mathcal{L}(\max(A, B))$. Since
$B \le A$ fails, there exists some $v_0 \in V_F$ with $\operatorname{ord}_{v_0} B > \operatorname{ord}_{v_0} A$, and this $v_0$ has
$\operatorname{ord}_{v_0} \max(A, B) > \operatorname{ord}_{v_0} A$. Thus $\deg \max(A, B) > \deg A$.

The second inequality proved above shows that the degree is bounded on all
divisors whose parallelotopes are in $\ker\omega$. In finitely many steps we consequently
arrive at a divisor $C$ with $\mathcal{L}(C) \subseteq \ker\omega$ such that any divisor $B$ with $\mathcal{L}(B) \subseteq \ker\omega$
has $B \le C$. Then $C$ is the unique maximum divisor on whose parallelotope $\omega$
vanishes. The parallelotope determines the divisor, and the proof of the corollary
is complete.


Recall from Section 2 that the additive subgroup $P_F$ of principal divisors within
the group $D_F$ of all divisors breaks $D_F$ into equivalence classes known as divisor
classes. The group $C_F = D_F/P_F$ is the group of all divisor classes. The operation
of a principal divisor $(y)$, for $y \in F^\times$, on a divisor $A$ is $A \mapsto A + (y)$. On the
other hand, we have seen that if a nonzero differential $\omega$ vanishes on $\mathcal{L}(A)$, then
$y\omega$ vanishes on $\mathcal{L}(A + (y))$. In the notation of the remarks with Corollary 9.14,
we therefore have

    \[ \operatorname{Div}(y\omega) = \operatorname{Div}(\omega) + (y). \]


A single orbit of nonzero differentials under the scalar-multiplication action on
Diff(F) by $F^\times$ thus yields a single divisor class within $D_F$. We shall show that
Diff(F) is 1-dimensional as an $F$ vector space. Then the nonzero differentials
form a single orbit under $F^\times$, and the divisors that arise as $\operatorname{Div}(\omega)$ for some
nonzero differential $\omega$ form a single divisor class.


Lemma 9.15. As a vector space over $F$, the space Diff(F) of differentials is
1-dimensional.


PROOF. First we prove that Diff(F) is nonzero. Referring to Theorem 9.13,
we know that $\delta(A) = \dim_k \bigl(\mathbb{A}_F/(\mathcal{L}(A) + \Delta(F))\bigr)$. If $\delta(A) > 0$, then there exist
nonzero linear functionals on $\mathbb{A}_F/(\mathcal{L}(A) + \Delta(F))$, and the lift of such a nonzero
linear functional to $\mathbb{A}_F$ is a nonzero differential. Thus it is enough to produce a
divisor $A$ with $\delta(A) > 0$. Fix $v_0$ in $V_F$, and let $A = -2v_0$. Proposition 9.10
shows that $\ell(A) = 0$. Since $\deg A = -2f_{v_0} \le -2$, where $f_{v_0}$ is the residue class
degree of $v_0$, we obtain


    \[ \delta(A) = \ell(A) - \deg A - (1 - g) \ge 2 + g - 1 = g + 1 > 0, \]


and this $A$ has $\delta(A) > 0$.

Now we shall prove that the $F$ dimension of Diff(F) is at most 1. Arguing by
contradiction, suppose that $\omega$ and $\omega'$ are differentials that are linearly independent
over $F$. If $\omega$ vanishes on $\mathcal{L}(A)$ and $\omega'$ vanishes on $\mathcal{L}(A')$, then $\omega$ and $\omega'$ both vanish
on $\mathcal{L}(A) \cap \mathcal{L}(A') \supseteq \mathcal{L}(C)$, where $C = \min(A, A')$. Let $B$ be an arbitrary divisor.
Suppose for the moment that $L(B) \neq 0$. If $y \neq 0$ is in $L(B)$, then $(y) \ge -B$,
and $C + (y) \ge C - B$. So $\mathcal{L}(C + (y)) \supseteq \mathcal{L}(C - B)$. We have seen that the
vanishing of $\omega$ on $\mathcal{L}(C)$ implies the vanishing of $y\omega$ on $\mathcal{L}(C + (y))$. Therefore
$y\omega$ vanishes on $\mathcal{L}(C - B)$. Similarly $y\omega'$ vanishes on $\mathcal{L}(C - B)$.

Still with $L(B) \neq 0$, let $n = \ell(B)$, and let $x_1, \dots, x_n$ and $y_1, \dots, y_n$ be bases
of $L(B)$ over $k$. Then $x_1\omega, \dots, x_n\omega, y_1\omega', \dots, y_n\omega'$ are linearly independent
over $k$ because a relation


    \[ \sum_{i=1}^n a_i x_i \omega + \sum_{j=1}^n b_j y_j \omega' = 0 \]


would mean that the members $x = \sum_{i=1}^n a_i x_i$ and $y = \sum_{j=1}^n b_j y_j$ of $F$ have
$x\omega + y\omega' = 0$. Since $\omega$ and $\omega'$ are assumed to be linearly independent over $F$,
$x = y = 0$. But then $a_i = 0$ for all $i$ and $b_j = 0$ for all $j$. Consequently we
can generate $2n$ linearly independent differentials that all vanish on $\mathcal{L}(C - B)$.
These differentials may be regarded as linear functionals on the $k$ vector space
$\mathbb{A}_F/(\mathcal{L}(C - B) + \Delta(F))$, whose $k$ dimension is $\delta(C - B)$ by Theorem 9.13.
Consequently

    \[ \delta(C - B) \ge 2\ell(B), \]


and this inequality is true also if $L(B) = 0$, by Riemann's inequality. Substituting
from the formula for $\delta(\,\cdot\,)$, we obtain


    \[ \ell(C - B) - \deg(C - B) - 1 + g \ge 2\ell(B) \]
    \[ \qquad = 2(\deg B + 1 - g + \delta(B)) \]
    \[ \qquad \ge 2\deg B + 2 - 2g \]


because Riemann's inequality shows that $\delta(B) \ge 0$. Replacing $\deg(C - B)$ by
$\deg C - \deg B$ gives


    \[ \deg B \le \ell(C - B) - \deg C - 3 + 3g. \qquad (*) \]


Proposition 9.10 shows that $\ell(C - B) \le 1 + \deg(C - B)$ if $\ell(C - B) \neq 0$. In
this case the two inequalities together give


    \[ 2\deg B \le -2 + 3g; \]


hence $\ell(C - B) = 0$ if $\deg B$ is positive and sufficiently large. Choosing then a
divisor $B$ with $\deg B$ positive and sufficiently large, we have $\ell(C - B) = 0$, and
$(*)$ gives

    \[ \deg B \le -\deg C - 3 + 3g. \]


Since the right side is fixed and the left side can be made arbitrarily large, we
have arrived at a contradiction.


As a result of Lemma 9.15, the divisors of the form $\operatorname{Div}(\omega)$ for some nonzero
differential $\omega$ constitute a single class in the group $C_F = D_F/P_F$ of divisor classes.
This class is called the canonical class of $F$, and any divisor in the class is called
a canonical divisor.


Theorem 9.16 (Riemann–Roch Theorem). Let $F$ be a function field in one
variable over a field $k$, and suppose that every member of $F$ not in $k$ is transcendental
over $k$. If $A$ is any divisor of $F$ and $C$ is any canonical divisor, then


    \[ \ell(A) = \deg A + (1 - g) + \ell(C - A), \]


where $g$ is the genus of $F$.


PROOF. Lemma 9.15 shows that there exists a nonzero differential $\omega_0$. Let
$C_0 = \operatorname{Div}(\omega_0)$. Lemma 9.15 shows that $C = C_0 + (y_0)$ for some $y_0 \in F^\times$. Then
$\omega = y_0\omega_0$ has


    \[ \operatorname{Div}(\omega) = \operatorname{Div}(y_0\omega_0) = \operatorname{Div}(\omega_0) + (y_0) = C_0 + (y_0) = C. \]


Let $B$ be a divisor to be specified, and consider $C - B$. Any nonzero differential
$\omega'$ vanishing on $\mathcal{L}(C - B)$ is of the form $\omega' = z\omega$ for some $z \in F^\times$ by Lemma
9.15, and $\operatorname{Div}(\omega') = \operatorname{Div}(z\omega) = C + (z)$. Therefore $\mathcal{L}(C + (z)) \supseteq \mathcal{L}(C - B)$,
$C + (z) \ge C - B$, and $(z) \ge -B$. This inequality means that $z$ is in $L(B)$.
Conversely if $y$ is any nonzero element in $L(B)$, then $(y) \ge -B$ and $C + (y) \ge
C - B$. So $\mathcal{L}(C + (y)) \supseteq \mathcal{L}(C - B)$. We know that $y\omega$ vanishes on $\mathcal{L}(C + (y))$,
and hence $y\omega$ vanishes on $\mathcal{L}(C - B)$.

Consequently the differentials vanishing on $\mathcal{L}(C - B)$ are exactly the differentials
$y\omega$ with $y$ in $L(B)$. Such differentials vanish on $\Delta(F)$ by definition,
and the space of them is $k$ isomorphic to the space of $k$ linear functionals on
$\mathbb{A}_F/(\mathcal{L}(C - B) + \Delta(F))$. By Theorem 9.13 the latter space has $k$ dimension
$\delta(C - B)$, and hence the space of differentials in question has $k$ dimension
$\delta(C - B)$. In short,

    \[ \delta(C - B) = \ell(B). \]

Since $B$ is arbitrary, we can specialize it to $B = C - A$. Then we obtain

    \[ \ell(C - A) = \delta(A) = \ell(A) - \deg A - (1 - g), \]


and the theorem follows.


5. Applications of the Riemann–Roch Theorem


We begin with some immediate applications of the Riemann–Roch Theorem, and
then we obtain some applications that require arguments that are a bit more subtle.
Another application appears in the problems at the end of Chapter X.


Corollary 9.17. If $C$ is any canonical divisor, then $\ell(C) = g$.


PROOF. Put $A = 0$ in Theorem 9.16, and use the fact given in Corollary 9.4
that $\ell(0) = 1$.


Corollary 9.18. If $C$ is any canonical divisor, then $\deg C = 2g - 2$.


PROOF. Put $A = C$ in Theorem 9.16, and apply Corollary 9.17 and Corollary
9.4.


Corollary 9.19. Any divisor $A$ with $\deg A > 2g - 2$ has $\delta(A) = 0$, i.e.,
$\ell(A) = \deg A + (1 - g)$.

PROOF. If $\deg A > 2g - 2$, then it follows from Corollary 9.18 that
$\deg(C - A) < 0$. By Proposition 9.10, $\ell(C - A) = 0$. Then the corollary
is immediate from Theorem 9.16.


Corollary 9.20. If $A$ is a divisor with $\deg A = 2g - 2$, then either $A$ is a
canonical divisor and $\ell(A) = g$, or $A$ is not a canonical divisor and $\ell(A) = g - 1$.


PROOF. If $A$ is a canonical divisor, then $\ell(A) = g$ by Corollary 9.17. Otherwise,
the divisor $C - A$, which has degree 0 by Corollary 9.18, is not a principal
divisor. Any nonzero $y$ in $L(C - A)$ then would have $(y) \ge -(C - A)$ and
$0 = \deg(y) \ge -\deg(C - A) = 0$; hence $v(y) = -\operatorname{ord}_v(C - A)$ for all $v$, and
$(y) = -(C - A)$. Then $C - A = (y^{-1})$ is principal, contradiction. Consequently
$L(C - A) = 0$ and $\ell(C - A) = 0$.
Theorem 9.16 now gives $\ell(A) = \deg A + (1 - g) = (2g - 2) + (1 - g) = g - 1$.


EXAMPLES OF CANONICAL DIVISORS.


(1) Genus $g = 0$. In Corollary 9.20 with $g = 0$, the alternative $\ell(A) =
g - 1 = -1$ is impossible, and therefore every divisor with degree $-2$ is a
canonical divisor.


(2) Genus $g = 1$. In Corollary 9.20 with $g = 1$, take $A = 0$. Then $\ell(A) =
1 = g$ by Corollary 9.4. So Corollary 9.20 says that the divisor 0 is a canonical
divisor.
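
The bookkeeping in Corollaries 9.19 and 9.20 and in the two examples can be summarized in a short sketch. The function name `ell_from_degree` and the flag `is_canonical` are hypothetical, introduced only for illustration; the sketch records when the degree of $A$ and the genus alone determine $\ell(A)$, and it returns `None` in the remaining range, where the results above give no single value.

```python
from typing import Optional


def ell_from_degree(deg_a: int, g: int,
                    is_canonical: Optional[bool] = None) -> Optional[int]:
    """Return l(A) when deg A and the genus g determine it, else None.

    Hypothetical helper: it only encodes Proposition 9.10 (as used in the
    proof of Corollary 9.19), Corollary 9.19, and Corollary 9.20.
    """
    if deg_a < 0:
        # A divisor of negative degree has L(A) = 0.
        return 0
    if deg_a > 2 * g - 2:
        # Corollary 9.19: delta(A) = 0.
        return deg_a + (1 - g)
    if deg_a == 2 * g - 2 and is_canonical is not None:
        # Corollary 9.20: the value depends on whether A is canonical.
        return g if is_canonical else g - 1
    # 0 <= deg A <= 2g - 2 otherwise: not determined by the degree alone.
    return None


if __name__ == "__main__":
    print(ell_from_degree(-2, 0))        # genus 0, degree -2 (Example 1)
    print(ell_from_degree(0, 1, True))   # genus 1, A = 0 (Example 2)
    print(ell_from_degree(1, 2))         # undetermined range
```

In Example (1) the two applicable rules agree: a divisor of degree $-2$ in genus 0 has $\ell(A) = 0 = g$.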


Corollary 9.21. If $v_0$ is in $V_F$ and $n > \max(2g - 1, 0)$, then there exists a
nonscalar $x$ in $F^\times$ with $(x)_\infty \le nv_0$.


PROOF. Let $A = nv_0$, and let $f_{v_0}$ be the residue class degree of $v_0$. Then
$\deg A = nf_{v_0} \ge n > \max(2g - 1, 0)$, and Corollary 9.19 gives


    \[ \ell(A) = \deg A + (1 - g) = nf_{v_0} + (1 - g) \]
    \[ \qquad > \max(2g - 1, 0) + (1 - g) = \max(g, 1 - g) \ge 1. \]

Hence $\ell(A) \ge 2$, and $L(A)$ contains a nonscalar element $x$. This $x$ has


    \[ -n = -\operatorname{ord}_{v_0} A \le \operatorname{ord}_{v_0}(x) = \operatorname{ord}_{v_0}(x)_0 - \operatorname{ord}_{v_0}(x)_\infty = -\operatorname{ord}_{v_0}(x)_\infty, \]


and thus $(x)_\infty \le nv_0$.


Doubly periodic meromorphic functions on $\mathbb{C}$ in the subject of complex analysis
may be viewed as meromorphic functions on some torus,¹² which is a compact
Riemann surface of genus 1. The Weierstrass $\wp$ function for the torus in question
has a double pole at one point, two zeros, and no other poles or zeros. It is therefore
a function $x$ with $(x)_\infty = 2v_0$ if $v_0$ is the discrete valuation corresponding to the
location of the pole. Hence this $x$ provides an example with equality holding in
Corollary 9.21 when $g = 1$. A theorem of Liouville in this terminology says that
there is no meromorphic function on the torus having just one simple pole and no
other poles. The final corollaries abstract this result to our setting, but they need
an additional hypothesis to ensure that $f_{v_0} = 1$. Certainly $f_{v_0}$ will equal 1 if $k$ is
algebraically closed. We consider $g = 1$ and $g > 1$ separately. These corollaries
will be generalized in Problems 23–25 at the end of the chapter.


Corollary 9.22. If $k$ is algebraically closed, if $v_0$ is in $V_F$, and if $g = 1$, then
every $x$ in $F$ with $(x)_\infty \le v_0$ is a scalar multiple of the identity.


PROOF. Put $A = v_0$. We seek $x \in F$ with $v_0(x) \ge -1 = -\operatorname{ord}_{v_0} A$ and
with $v(x) \ge 0 = -\operatorname{ord}_v A$ for all other $v$. Thus we seek $x$ in $L(A)$. This $A$
has $\deg A = 1 = g = 2g - 1$. By Corollary 9.19, $\ell(A) = \deg A + (1 - g) =
1 + (1 - 1) = 1$. Since $L(A)$ already contains the multiples of the identity, it
contains nothing else.


¹²The particular torus is $\mathbb{C}/\Lambda$, where $\Lambda$ is the lattice of periods.

Corollary 9.23. If $k$ is algebraically closed, if $v_0$ is in $V_F$, and if $g > 1$, then
every $x$ in $F$ with $(x)_\infty \le v_0$ is a scalar multiple of the identity.


PROOF. We argue by contradiction. Suppose that $x$ is a nonscalar element in
$L(v_0)$. Take $r = 2g - 1$, and let $c_1, \dots, c_r$ be distinct members of $k$. For each $j$
with $1 \le j \le r$, $x - c_j$ is in $L(v_0)$. Since $\deg(x - c_j) = 0$, there exists a unique
$v_j \in V_F$ with $v_j(x - c_j) = 1$. The divisor of the element $(x - c_j)^{-1}$ is then $v_0 - v_j$.
It follows that every $k$ linear combination of the elements $(x - c_j)^{-1}$ lies in $L(A)$
for $A = v_1 + \cdots + v_r$. On the other hand, these elements are linearly independent
because $v_i\bigl(\sum_{j=1}^r a_j (x - c_j)^{-1}\bigr) < 0$ if and only if $a_i \neq 0$. Thus $\ell(A) \ge 2g - 1$
and $\deg A = 2g - 1$. Since $\deg A > 2g - 2$, Corollary 9.19 is applicable and gives
$\ell(A) = \deg A + 1 - g$. Thus $2g - 1 \le \ell(A) = \deg A + 1 - g = 2g - 1 + 1 - g = g$,
and we obtain the contradiction $g \le 1$.


6. Problems 


1. Let $F$ be a function field in one variable over the field $k$, and let $k'$ be the subfield
of all members of $F$ that are algebraic over $k$.
(a) Suppose that $t_1, \dots, t_n$ are members of $k'$ that are linearly independent over
$k$, and suppose that $x \in F$ is transcendental over $k$. Prove that $t_1, \dots, t_n$ are
linearly independent over $k(x)$.
(b) Deduce from (a) that $[k' : k] \le [k'(x) : k(x)]$.
(c) Deduce that $[k' : k] < \infty$.


Problems 2–4 concern perfect fields, which were defined in Section VII.3. The field
$k$ is perfect if either it has characteristic 0 or else it has characteristic $p$ and the field
map $x \mapsto x^p$ of $k$ into itself is onto.

2. Prove that an algebraic extension of a perfect field is perfect. 

3. When k is perfect, refine an argument in Section 1 by making use of Theorems 
7.18, 7.20, 7.22, and the Theorem of the Primitive Element, and show that any 
function field in one variable is the function field of some affine plane curve 
irreducible over k. 

4. Let $k$ be a perfect field. An affine plane curve $f(X, Y)$ irreducible over $k$ is
nonsingular at a point $(a, b)$ of its zero locus if at least one of $\frac{\partial f}{\partial X}(a, b)$ and
$\frac{\partial f}{\partial Y}(a, b)$ is nonzero. Using Bezout's Theorem and taking a cue from the proof of
Theorem 7.20, prove that the curve can be singular at only finitely many points
of its zero locus.


Problems 5–11 seek to attach a discrete valuation of the function field of an irreducible
affine plane curve to each point of the zero locus at which the curve is nonsingular.
Let $k$ be a base field, let $f(X, Y)$ be an irreducible polynomial in $k[X, Y]$, let $R =
k[X, Y]/(f(X, Y))$, let $x$ and $y$ be the images of $X$ and $Y$ in $R$, and let $F$ be the field
of fractions of $R$. Suppose that $(a, b) \in k^2$ has the property that $f(a, b) = 0$. The
condition of nonsingularity of $f$ at $(a, b)$ is that one of $\frac{\partial f}{\partial X}$ and $\frac{\partial f}{\partial Y}$ be nonvanishing at
$(a, b)$, and it will be assumed that $\frac{\partial f}{\partial X}(a, b) \neq 0$. Observe (cf. 7.16) that if $S$ is
any integral domain, if $s$ is in $S$, and if $c(X)$ is in $S[X]$, then $c(X) - c(s) = (X - s)d(X)$
for some $d(X)$ in $S[X]$.


5. Let $f_1(X)$ be the member of $k[X]$ obtained as above to make $f(X, b) =
(X - a) f_1(X)$. Using the fact that $\frac{\partial f}{\partial X}(a, b) \neq 0$, prove that $f_1(a) \neq 0$ and
therefore also that $f_1(x) \neq 0$.

6. Let $g(X, Y)$ be a member of $k[X, Y]$ with $g(x, y) \neq 0$. Prove that if $g(a, b) = 0$,
then there exist $g_1(X)$ in $k[X]$ and $h_1(X, Y)$ in $k[X, Y]$ with


    \[ g(X, Y) f_1(X) - f(X, Y) g_1(X) = (Y - b) h_1(X, Y), \]


and deduce that $g(x, y) = (y - b) h_1(x, y)/f_1(x)$.

7. Show that there is a discrete valuation $v_1$ of $F$ over $k$ with $v_1(y - b) > 0$.

8. If $h_1(a, b) = 0$ in Problem 6, then the process can be repeated to give


    \[ g(x, y) = (y - b)^2 h_2(x, y)/f_1(x)^2. \]


It can be repeated again if $h_2(a, b) = 0$, and so on. By applying the valuation
$v_1$ of the previous problem to $g(x, y)$, show that there is an upper bound to the
integers $k \ge 0$ such that a nonzero member $g(x, y)$ in $R$ can be written in the
form $g(x, y) = (y - b)^k h_k(x, y)/f_1(x)^k$ for some $h_k(x, y)$ in $R$.

9. (a) Deduce that each nonzero $g(x, y)$ in $R$ is of the form


    \[ g(x, y) = (y - b)^n h(x, y)/f_1(x)^n \]


with $n \ge 0$, $h(x, y)$ in $R$, and $h(a, b) \neq 0$, and that the integer $n$ and the
member $h(x, y)$ of $R$ are uniquely determined by $g(x, y)$.

(b) Conclude that every nonzero member $g(x, y)$ of the field of fractions $F$ is
of the form $(y - b)^n h_1(x, y)/h_2(x, y)$ with $n$ in $\mathbb{Z}$, $h_1(x, y)$ and $h_2(x, y)$
nonzero in $R$, $h_1(a, b) \neq 0$, and $h_2(a, b) \neq 0$.

(c) Prove in (b) that $g(x, y)$ uniquely determines $n$.

10. Write each nonzero $g(x, y)$ in $F$ as in (b) of the previous problem, and put
$v(g) = n$. Also, define $v(0) = \infty$. Show that the resulting function $v$ is a
well-defined valuation of $F$ having $R$ in its valuation ring, taking the value 0 on
all members of $R$ that are nonvanishing at $(a, b)$, and having all members of $R$
vanishing at $(a, b)$ in its valuation ideal.

11. Prove that there is only one valuation of $F$ over $k$ taking the value 0 on all members
of $R$ that are nonvanishing at $(a, b)$ and having all members of $R$ vanishing at
$(a, b)$ in its valuation ideal.


Problems 12–20 compute the genus of certain function fields in one variable. Let $k$
be a field of characteristic $\neq 2$, let $f(X)$ be a square-free nonconstant polynomial in
$k[X]$, let $F = k(X)[Y]/(Y^2 - f(X))$, and let $x$ and $y$ be the images of $X$ and $Y$ in $F$.
In these problems, $p$ denotes a positive integer.
12. Verify that
(a) the element $x$ is transcendental over $k$, $y$ is algebraic over $k(x)$ with $y^2 =
f(x)$, and $F$ is a function field in one variable over $k$,
(b) every member of $F$ is uniquely of the form $a(x) + yb(x)$ with $a(x)$ and $b(x)$
in $k(x)$,
(c) every member of $F$ not in $k$ is transcendental over $k$,
(d) $F/k(x)$ is a Galois extension of degree 2, and the nontrivial element $\sigma$ of
$\operatorname{Gal}(F/k(x))$ satisfies $\sigma(a(x) + yb(x)) = a(x) - yb(x)$ for $a(x)$ and $b(x)$
in $k(x)$.


13. Prove that the integral closure of $k[x]$ in $F$ is the ring $R$ of all elements
$a(x) + yb(x)$ such that $a(x)$ and $b(x)$ are in $k[x]$.


14. (a) Deduce from the previous problem that $R$ is the set of all members $z$ of $F$
such that $v(z) \ge 0$ for all $v$ in $D_F$ that satisfy $v(x) \ge 0$.
(b) Deduce from (a) that $L(p(x)_\infty) \subseteq R$.


15. Let $v$ be any member of $D_F$ with $v(x) < 0$.
(a) Prove that every nonzero $c(x)$ in $k[x]$ has $v(c(x)) = (\deg c)v(x)$.
(b) Prove that $v(y) = \frac12(\deg f)v(x)$.
(c) Prove that if $a(x)$ and $b(x)$ are in $k[x]$ with $\deg b + \frac12 \deg f \le p$ and
$\deg a \le p$, then $v(a(x) + yb(x)) \ge pv(x)$.


16. Prove that if $a(x)$ and $b(x)$ are in $k[x]$ with $\deg b + \frac12 \deg f \le p$ and $\deg a \le p$,
then $a(x) + yb(x)$ lies in $L(p(x)_\infty)$.

17. (a) Prove that if $v$ is in $D_F$ and if $\sigma$ is in $\operatorname{Gal}(F/k(x))$, then the function $v^\sigma$
defined by $v^\sigma(z) = v(\sigma(z))$ for $z \in F$ is in $D_F$.
(b) Why is $v^\sigma(x) < 0$ if and only if $v(x) < 0$?
(c) Deduce that if $z$ is in $L(p(x)_\infty)$, then so is $\sigma(z)$.

18. (a) Using the previous problem, show that if $a(x)$ and $b(x)$ are in $k[x]$ with
$a(x) + yb(x)$ in $L(p(x)_\infty)$ and if $v$ is a member of $D_F$ with $v(x) < 0$,
then $v(a(x)) \ge pv(x)$ and $v(a(x)^2 - f(x)b(x)^2) \ge 2pv(x)$. Conclude that
$\deg a \le p$ and $\deg(a^2 - fb^2) \le 2p$.

(b) Deduce that $L(p(x)_\infty)$ consists of all members $a(x) + yb(x)$ of $R$ such that
$\deg a \le p$ and $\deg b + \frac12 \deg f \le p$.

19. Calculate that $\ell(p(x)_\infty) = 2p + 2 - [\frac12(1 + \deg f)]$ if $p \ge [\frac12(1 + \deg f)]$.
Here $[\,\cdot\,]$ denotes the greatest integer function.

20. (a) Why is $\deg (x)_\infty = 2$?
(b) Using Corollary 9.19 with $A = p(x)_\infty$ for a suitable $p$, prove that the genus
of $F$ is $g = [\frac12(1 + \deg f)] - 1$.
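
The formulas stated in Problems 18 to 20 can be cross-checked numerically. The following is only a sanity-check sketch, not a solution to the problems: it takes the description of $L(p(x)_\infty)$ in Problem 18(b), the equality $\deg (x)_\infty = 2$ in Problem 20(a), and the genus formula in Problem 20(b) as given, and the function names are hypothetical.

```python
def genus_double_cover(deg_f: int) -> int:
    """Genus formula stated in Problem 20(b): g = [(1 + deg f)/2] - 1."""
    if deg_f < 1:
        raise ValueError("f must be nonconstant")
    return (1 + deg_f) // 2 - 1


def ell_by_counting(p: int, deg_f: int) -> int:
    """Count free coefficients of a(x) + y b(x) allowed by Problem 18(b)."""
    dim_a = p + 1                      # deg a <= p
    max_deg_b = (2 * p - deg_f) // 2   # largest integer d with d + deg_f/2 <= p
    dim_b = max_deg_b + 1 if max_deg_b >= 0 else 0
    return dim_a + dim_b


def ell_by_corollary_9_19(p: int, deg_f: int) -> int:
    """l(A) = deg A + 1 - g for A = p(x)_oo, assuming deg A = 2p > 2g - 2."""
    return 2 * p + 1 - genus_double_cover(deg_f)


if __name__ == "__main__":
    for deg_f in range(1, 7):
        p = (1 + deg_f) // 2           # smallest p allowed in Problem 19
        print(deg_f, genus_double_cover(deg_f),
              ell_by_counting(p, deg_f), ell_by_corollary_9_19(p, deg_f))
```

For each degree the two computations of $\ell(p(x)_\infty)$ agree, which is the consistency that Problem 20(b) exploits.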

Problems 21–22 compute the genus of certain further function fields in one variable.
The notation is as in Problems 12–20 except that $f(X)$ is allowed to have repeated
factors. Suppose that $f(X) = g(X)^2 h(X)$, where $h(X)$ is a square-free nonconstant
polynomial and $g(X)$ is in $k[X]$. Let $F = k(X)[Y]/(Y^2 - f(X))$.
21. With $F' = k(X)[Z]/(Z^2 - h(X))$, exhibit a field isomorphism $F \to F'$ fixing $k$.
22. Suppose that $f(X)$ has degree 3.
(a) Prove that $F$ has genus 1 if $f(X)$ has no repeated root in $k$ and that $F$ has
genus 0 otherwise.
(b) Prove that the affine plane curve $Y^2 - f(X)$ over $k$ has a singularity in $k_{\mathrm{alg}}^2$ if
and only if $f(X)$ has a repeated root in $k_{\mathrm{alg}}$. Here $k_{\mathrm{alg}}$ denotes an algebraic
closure of $k$.


Problems 23–25 introduce Weierstrass points. Let $k$ be an algebraically closed field,
and let $F$ be a function field in one variable over $k$ of genus $g$. Fix a discrete valuation
$v$ in $D_F$.

23. Why is it true that $\ell(0v) = 1$, $\ell(1v) = 1$ if $g \ge 1$, $\ell((2g - 1)v) = g$, $\ell(2gv) =
g + 1$, and $\ell(nv) \le \ell((n + 1)v) \le \ell(nv) + 1$ for all integers $n \ge 0$?

24. Deduce from the previous problem that there exist exactly $g$ integers $0 < n_1 <
n_2 < \cdots < n_g < 2g$ such that there is no $x$ in $F$ with $(x)_\infty = n_j v$. (Educational
note: The integers $n_j$ are called the Weierstrass gaps of $v$, and $(n_1, \dots, n_g)$ is
the gap sequence for $v$. Classically when $F$ is viewed as the function field of
an everywhere nonsingular projective curve, then the points of the zero locus in
projective space are in one-one correspondence with the members of $D_F$; with
this understanding, the point corresponding to $v$ is called a Weierstrass point if
the gap sequence for $v$ is anything but $(1, 2, \dots, g)$. Accordingly let us call $v$ a
Weierstrass valuation in this case.)

25. Prove that
(a) $v$ is a Weierstrass valuation if and only if $\ell(gv) > 1$.

(b) 1 is a Weierstrass gap if $g > 0$.

(c) $v$ is not a Weierstrass valuation if $g = 0$ or $g = 1$.

(d) if $r$ and $s$ are positive integers with sum $< 2g$ that are not Weierstrass gaps
at $v$, then $r + s$ is not a Weierstrass gap at $v$.

(e) if 2 is not a Weierstrass gap at $v$, then the gap sequence is $(1, 3, 5, \dots, 2g - 1)$.
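
The definition of the gap sequence in Problem 24 can be made concrete with a small sketch. It rests on the observation that an element $x$ with $(x)_\infty = nv$ exists exactly when $L(nv)$ is strictly larger than $L((n - 1)v)$, so that $n$ is a gap exactly when $\ell(nv) = \ell((n - 1)v)$. The function names and the sample dimension lists are hypothetical inputs chosen to satisfy the constraints in Problem 23 for $g = 2$; they are not computed from any particular function field.

```python
from typing import List, Sequence


def weierstrass_gaps(ells: Sequence[int]) -> List[int]:
    """Gaps of v, given ells[n] = l(nv) for n = 0, 1, ..., 2g.

    n >= 1 is a gap exactly when passing from (n-1)v to nv adds no new
    function, i.e. when l(nv) == l((n-1)v).
    """
    return [n for n in range(1, len(ells)) if ells[n] == ells[n - 1]]


def is_weierstrass(ells: Sequence[int]) -> bool:
    """True when the gap sequence differs from (1, 2, ..., g)."""
    g = (len(ells) - 1) // 2
    return weierstrass_gaps(ells) != list(range(1, g + 1))


if __name__ == "__main__":
    # Two hypothetical dimension tables for g = 2, indexed by n = 0..4.
    print(weierstrass_gaps([1, 1, 2, 2, 3]), is_weierstrass([1, 1, 2, 2, 3]))
    print(weierstrass_gaps([1, 1, 1, 2, 3]), is_weierstrass([1, 1, 1, 2, 3]))
```

The first table has $\ell(gv) = \ell(2v) = 2 > 1$ and the second has $\ell(2v) = 1$, in line with the criterion of Problem 25(a).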


CHAPTER X 


Methods of Algebraic Geometry 


Abstract. This chapter investigates the objects and mappings of algebraic geometry from a geo- 
metric point of view, making use especially of the algebraic tools of Chapter VII and of Sections 
7-10 of Chapter VIII. In Sections 1-12, k denotes a fixed algebraically closed field. 

Sections 1-6 establish the definitions and elementary properties of varieties, maps between 
varieties, and dimension, all over k. Sections 1-3 concern varieties and dimension. Affine algebraic 
sets, affine varieties, and the Zariski topology on affine space are introduced in Section 1, and 
projective algebraic sets and projective varieties are introduced in Section 3. Section 2 defines 
the geometric dimension of an affine algebraic set, relating the notion to Krull dimension and 
transcendence degree. The actual context of Section 2 is a Noetherian topological space, the Zariski 
topology on affine space being an example. In such a space every closed subset is the finite union of 
irreducible closed subsets, and the union can be written in a certain way that makes the decomposition 
unique. Every nonempty closed set has a meaningful geometric dimension. In affine space the 
irreducible closed sets are the varieties, and each variety acquires a geometric dimension. The 
discussion in Section 2 applies in the context of projective space as well, and thus each projective 
variety acquires a geometric dimension. Moreover, any nonempty open subset of a Noetherian 
space is Noetherian. A nonempty open subset of an affine variety is called quasi-affine, and a 
nonempty open subset of a projective variety is called quasiprojective. Each quasi-affine variety or 
quasiprojective variety has a dimension equal to that of its closure, which is a variety. 

Sections 4–6 take up maps between varieties. Section 4 introduces spaces of scalar-valued
functions on quasiprojective varieties: rational functions, functions regular at a point, and functions
regular on an open set. The section goes on to relate these notions for the different kinds of varieties.
Section 5 introduces morphisms, which are a restricted kind of function between varieties. The
tools of Sections 4–5 together show that for many purposes all the different kinds of varieties can be
treated as quasiprojective varieties. Section 6 introduces rational maps between varieties; these are 
not everywhere-defined functions, but each can be restricted to an open dense subset on which it is 
a morphism. Rational maps with dense image correspond to field mappings of the fields of rational 
functions, with the order of the mappings reversed. 

Section 7 concerns singularities at points of varieties, still over the field k. Zariski’s Theorem 
was stated in Chapter VII for affine varieties and partly proved at that time. In the current context 
it has a meaning for any point of any quasiprojective variety. The section proves the full theorem, 
which characterizes singular points in a way that shows they remain singular under isomorphisms 
of varieties. 

Section 8 concerns classification questions over k for irreducible curves, i.e., quasiprojective
varieties of dimension 1. From Section 6 it is known that two irreducible curves are equivalent under
rational maps if and only if their fields of rational functions are isomorphic. The main theorem of 
Section 8 is that each such equivalence class of irreducible curves contains an everywhere nonsingular 
projective curve, and this curve is unique up to isomorphism of varieties. The points of this curve 
are parametrized by those discrete valuations of the underlying function field that are defined over k.

Sections 9–12 relate the general theory of Sections 1–6 to the topic of simultaneous
solutions of polynomial equations, as treated at length in Chapter VIII. Section 9 treats monomial
ideals in $k[X_1, \dots, X_n]$, identifying their zero loci concretely and computing their dimension. The
section goes on to introduce the affine Hilbert function of this ideal, which measures the proportion of
polynomials of degree $\le s$ not in the ideal. In the way that this function is defined, it is a polynomial
for large s called the affine Hilbert polynomial of the ideal. Its degree equals the dimension of the 
zero locus of the ideal. Section 10 extends this theory from monomial ideals to all ideals, again 
concretely computing the dimension of the zero loci, obtaining an affine Hilbert polynomial, and 
showing that its degree equals the dimension of the zero locus of the ideal. Section 11 adapts the 
theory to homogeneous ideals and projective algebraic sets by making use of the cone in affine 
space over the set in projective space. Section 12 applies the theory of Section 11 to address the 
question how the dimension of a projective algebraic set is cut down when the set is intersected with 
a projective hypersurface. A consequence of the theory is the result that a homogeneous system of 
polynomial equations over an algebraically closed field with more unknowns than equations has a 
nonzero solution. 

Section 13 is a brief introduction to the theory of schemes, which extends the theory of varieties 
by replacing the underlying algebraically closed field by an arbitrary commutative ring with identity. 


1. Affine Algebraic Sets and Affine Varieties 


We come now to the more geometric side of algebraic geometry. At least initially 
this means that we are interested in the set of simultaneous solutions of a system 
of polynomial equations in several variables. Because of the Nullstellensatz the 
natural starting point for the investigation is the case that the underlying field of 
coefficients is algebraically closed. 

Accordingly, throughout Sections 1–6 of this chapter, $k$ will denote an algebraically
closed field.¹ We fix a positive integer $n$ and denote by $A$ the polynomial
ring $A = k[X_1, \dots, X_n]$. Typical ideals of $A$ will be denoted by $\mathfrak{a}, \mathfrak{b}, \dots$. We
begin by expanding on some definitions made in Section VIII.2. The set


    \[ \mathbb{A}^n = \{(x_1, \dots, x_n) \in k^n\} \]


is called affine n-space. Members of $\mathbb{A}^n$ are called points in affine n-space, and
the functions $P \mapsto x_j(P)$ give the coordinates of the points.

To each subset $S$ of polynomials in $A$, we associate the locus of common
zeros, or zero locus of the members of $S$:


    \[ V(S) = \{P \in \mathbb{A}^n \mid f(P) = 0 \text{ for all } f \in S\}. \]


Any such set $V(S)$ is called an affine algebraic set in $\mathbb{A}^n$. If $S$ is a finite set
$\{f_1, \dots, f_k\}$ of polynomials, we allow ourselves to abbreviate $V(\{f_1, \dots, f_k\})$
as $V(f_1, \dots, f_k)$. It is immediate from the definitions that $V(S)$ is the same as
$V(\mathfrak{a})$ if $\mathfrak{a}$ is the ideal in $A$ generated by $S$. The Hilbert Basis Theorem shows that
every ideal of $A$ is finitely generated, and it follows that every affine algebraic set
is of the form $V(f_1, \dots, f_k)$ for some $k$ and some polynomials $f_1, \dots, f_k$.


¹The exposition in these sections is based in part on Chapters 2, 4, and 6 of Fulton's book,
Chapter I of Hartshorne's book, and Chapter I of Volume 1 of Shafarevich's books.

In Chapter VIII we worked extensively with examples of ideals of A and their 
corresponding affine algebraic sets, and it will not be necessary to give further 
examples of that kind now. 

Observe from the definition that $V(S) = \bigcap_{f \in S} V(f)$ for any subset $S$ of $A$. It
follows immediately that $S \mapsto V(S)$, as a function carrying each subset $S$ of $A$
to a subset $V(S)$ of $\mathbb{A}^n$, is inclusion reversing: $S_1 \subseteq S_2$ implies $V(S_1) \supseteq V(S_2)$.
Using this same identity, we obtain the following further properties of $V$.


Proposition 10.1. Affine algebraic sets in $\mathbb{A}^n$ have the following properties:
(a) $V(\varnothing) = V(0) = \mathbb{A}^n$ and $V(A) = \varnothing$,
(b) $V\big(\bigcup_\alpha S_\alpha\big) = \bigcap_\alpha V(S_\alpha)$ if the $S_\alpha$’s are arbitrary subsets of $A$,
(c) $V(S) = V(S_1) \cup V(S_2)$ if $S_1$ and $S_2$ are subsets of $A$ and if $S$ is defined
as the set of all products $f_1 f_2$ with $f_1 \in S_1$ and $f_2 \in S_2$.


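PROOF. Property (a) is immediate. For (b), we have


$$V\Big(\bigcup_\alpha S_\alpha\Big) = \bigcap_{f \in \bigcup_\alpha S_\alpha} V(f) = \bigcap_\alpha \bigcap_{f \in S_\alpha} V(f) = \bigcap_\alpha V(S_\alpha).$$


For (c), we observe first that $V(f_1 f_2) = V(f_1) \cup V(f_2)$ for any $f_1$ and $f_2$ in $A$.
Then


$$\begin{aligned}
V(S) &= \bigcap_{\substack{f_1 \in S_1 \\ f_2 \in S_2}} V(f_1 f_2) = \bigcap_{f_1 \in S_1} \bigcap_{f_2 \in S_2} \big(V(f_1) \cup V(f_2)\big) \\
&= \Big(\bigcap_{f_1 \in S_1} V(f_1)\Big) \cup \Big(\bigcap_{f_2 \in S_2} V(f_2)\Big) = V(S_1) \cup V(S_2).
\end{aligned}$$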
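
The set-theoretic identities of Proposition 10.1 can be checked by brute force in a small case. The following sketch is only an illustration: it assumes the finite field with 5 elements in place of the algebraically closed field $k$ (the identities themselves use only that the coefficients lie in a field), and the polynomials `f1` and `f2` are hypothetical samples that do not come from the text.

```python
from itertools import product

P = 5  # assumption: the finite field F_5, chosen only so that every set is finite

def zero_locus(polys, n=2, p=P):
    """Points of F_p^n at which every polynomial in polys vanishes."""
    return {pt for pt in product(range(p), repeat=n)
            if all(f(*pt) % p == 0 for f in polys)}

# Hypothetical sample polynomials in two variables.
f1 = lambda x, y: x - y
f2 = lambda x, y: x * x + y - 1
f1f2 = lambda x, y: f1(x, y) * f2(x, y)

# Property (c) for one-element sets: V(f1 f2) = V(f1) U V(f2).
union_holds = zero_locus([f1f2]) == zero_locus([f1]) | zero_locus([f2])
# Property (b) for two one-element sets: V({f1, f2}) = V(f1) n V(f2).
intersection_holds = zero_locus([f1, f2]) == zero_locus([f1]) & zero_locus([f2])
# Inclusion reversal: {f1} is a subset of {f1, f2}, so V(f1) contains V({f1, f2}).
reversal_holds = zero_locus([f1, f2]) <= zero_locus([f1])
# Property (a): the empty set of polynomials has the whole space as zero locus.
empty_gives_all = len(zero_locus([])) == P ** 2
```

The check of property (c) relies on the absence of zero divisors in a field: a product vanishes at a point only if one of the factors does.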


Properties (a), (b), and (c) in the proposition are the axioms for the closed
sets in a topology on $\mathbb{A}^n$. This topology is called the Zariski topology on affine
n-space. Every one-point set is closed. The Zariski topology on $\mathbb{A}^n$ is never
Hausdorff; for example, if $n = 1$, then it is the topology on $k^1 = k$ in which the
nonempty open sets are the complements of the finite sets. Since one-point sets
are closed and the topology is not Hausdorff, the Zariski topology on $\mathbb{A}^n$ is never
regular. At first glance it looks like a useless topology, but we shall see already
in Proposition 10.3b and again in Section 2 that it is quite helpful for handling
the bookkeeping used in passing back and forth between algebra and geometry.

Next we introduce a function $E \mapsto I(E)$, carrying each subset $E$ of $\mathbb{A}^n$ to an
ideal $I(E)$ in $A$, by the definition


$$I(E) = \{f \in A \mid f(P) = 0 \text{ for all } P \in E\}.$$


Then $I(E) = \bigcap_{P \in E} I(\{P\})$. It follows immediately that $E \mapsto I(E)$ is inclusion
reversing: $E_1 \subseteq E_2$ implies $I(E_1) \supseteq I(E_2)$. The result for $I(\,\cdot\,)$ that parallels
Proposition 10.1 is as follows.


Proposition 10.2. For fixed $n$, the function $I(\,\cdot\,)$ has the following properties:
(a) $I(\varnothing) = A$ and $I(\mathbb{A}^n) = 0$,
(b) $I(E_1 \cup E_2) = I(E_1) \cap I(E_2)$ if $E_1$ and $E_2$ are subsets of $\mathbb{A}^n$,
(c) $I(E_1 \cap E_2) \supseteq I(E_1) + I(E_2)$ if $E_1$ and $E_2$ are subsets of $\mathbb{A}^n$.


REMARKS. Equality can fail in (c). For example, if $E_1$ is the one-point set $\{0\}$
and $E_2$ is its complement, then $I(E_1 \cap E_2) = I(\varnothing) = A$, while $I(E_2) = 0$ and
$I(E_1)$ consists of all members of $A$ with 0 constant term.


PROOF. Property (a) is immediate. For (b), we have

$$I(E_1 \cup E_2) = \bigcap_{P \in E_1 \cup E_2} I(\{P\}) = \Big(\bigcap_{P \in E_1} I(\{P\})\Big) \cap \Big(\bigcap_{P \in E_2} I(\{P\})\Big) = I(E_1) \cap I(E_2).$$

In (c), the fact that $I(\,\cdot\,)$ is inclusion reversing implies that $I(E_1 \cap E_2) \supseteq I(E_1)$
and that $I(E_1 \cap E_2) \supseteq I(E_2)$. Since $I(E_1 \cap E_2)$ is closed under addition, (c)
follows.


This is all quite elementary. The less trivial question is the extent to which
$V(\,\cdot\,)$ and $I(\,\cdot\,)$ are inverse to one another. Proposition 10.3 gives the answer.


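Proposition 10.3. For fixed $n$,
(a) $I(V(\mathfrak{a})) = \sqrt{\mathfrak{a}}$ for each ideal $\mathfrak{a}$ in $A$,
(b) $V(I(E)) = \overline{E}$ for each subset $E$ of $\mathbb{A}^n$, where $\overline{E}$ is the Zariski closure
of $E$,
(c) $V(\mathfrak{a}) = V(\sqrt{\mathfrak{a}})$ for each ideal $\mathfrak{a}$ in $A$,
(d) any two ideals $\mathfrak{a}$ and $\mathfrak{b}$ in $A$ have $\mathfrak{a}\mathfrak{b} \subseteq \mathfrak{a} \cap \mathfrak{b} \subseteq \sqrt{\mathfrak{a}\mathfrak{b}}$ and consequently
have $V(\mathfrak{a} \cap \mathfrak{b}) = V(\mathfrak{a}\mathfrak{b}) = V(\mathfrak{a}) \cup V(\mathfrak{b})$.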


REMARKS. Recall from Section VII.1 that $\sqrt{\mathfrak{a}}$ denotes the radical of $\mathfrak{a}$, consisting
of all $f$ in $A$ such that $f^k$ is in $\mathfrak{a}$ for some integer $k \ge 1$. The radical of $\mathfrak{a}$
equals $\mathfrak{a}$ itself if $\mathfrak{a}$ is prime.


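PROOF. Conclusion (a) is the Nullstellensatz as formulated in Theorem 7.1b.

For (b), the definitions show that $V(I(E)) \supseteq E$. Since any set $V(S)$ is Zariski
closed, we must have $V(I(E)) \supseteq \overline{E}$. On the other hand, the fact that $\overline{E}$ is closed
means that $\overline{E} = V(S)$ for some $S$. Thus $V(S) = \overline{E} \supseteq E$, and the inclusion-reversing
property of $I(\,\cdot\,)$ gives $I(V(S)) \subseteq I(E)$. Since the definitions imply
that $S \subseteq I(V(S))$, we obtain $S \subseteq I(E)$. From the inclusion-reversing property
of $V(\,\cdot\,)$, we conclude that $\overline{E} = V(S) \supseteq V(I(E))$.

For (c), (a) and (b) give $V(\sqrt{\mathfrak{a}}) = V(I(V(\mathfrak{a}))) = \overline{V(\mathfrak{a})} = V(\mathfrak{a})$ because $V(\mathfrak{a})$
is closed.

For (d), the inclusion $\mathfrak{a}\mathfrak{b} \subseteq \mathfrak{a} \cap \mathfrak{b}$ is immediate. If $f$ is in $\mathfrak{a} \cap \mathfrak{b}$, then $f$ is
in $\mathfrak{a}$ and in $\mathfrak{b}$, and hence $f^2$ is in $\mathfrak{a}\mathfrak{b}$. Thus $f$ is in $\sqrt{\mathfrak{a}\mathfrak{b}}$. Applying $V(\,\cdot\,)$ gives
$V(\mathfrak{a}\mathfrak{b}) \supseteq V(\mathfrak{a} \cap \mathfrak{b}) \supseteq V(\sqrt{\mathfrak{a}\mathfrak{b}})$. Since $V(\mathfrak{a}\mathfrak{b}) = V(\sqrt{\mathfrak{a}\mathfrak{b}})$ by (c), $V(\mathfrak{a} \cap \mathfrak{b}) = V(\mathfrak{a}\mathfrak{b})$.
Finally $V(\mathfrak{a}\mathfrak{b}) = V(\mathfrak{a}) \cup V(\mathfrak{b})$ by Proposition 10.1c.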


An affine variety is any affine algebraic set of the form $V(\mathfrak{p})$, where $\mathfrak{p}$ is a
prime ideal² of $A$. That is, an affine variety is the locus of common zeros of any
prime ideal of $A$.

For example, if $f$ is an irreducible polynomial in $A$, then $f$ is prime because $A$
is a unique factorization domain, and consequently the principal ideal $(f)$ is prime.
Thus the zero locus in $\mathbb{A}^2$ of an irreducible polynomial $f$ in $k[X, Y]$ is an example
of an affine variety. This particular kind of affine variety is called an irreducible
affine plane curve.³ ⁴ More generally, if $f$ is irreducible in $A = k[X_1, \dots, X_n]$
with $n \ge 2$, then the zero locus of $f$ in $\mathbb{A}^n$ is called an irreducible affine
hypersurface.⁵ Another example of an affine variety is any translate of any vector
subspace of $\mathbb{A}^n$. Examples of affine varieties other than irreducible hypersurfaces,
translates of vector subspaces, and varieties built from other varieties in simple
ways often take some work to establish. The reason is that it is usually not easy
to show that a particular nonprincipal ideal is prime. Here is one example that is
manageable.


EXAMPLE. The twisted cubic in $\mathbb{A}^3$ is the zero locus $V(\mathfrak{p})$ of the ideal $\mathfrak{p}$ in
$k[X, Y, Z]$ given by $\mathfrak{p} = (Y - X^2, Z - X^3)$; that is, $V(\mathfrak{p}) = \{(x, x^2, x^3) \mid x \in k\}$.
The substitution homomorphism $\varphi$ that fixes $k$ and sends $X$ to $X$, $Y$ to $X^2$, and
$Z$ to $X^3$ carries $k[X, Y, Z]$ into $k[X]$. It is onto $k[X]$ because any polynomial in
$X$ alone is sent to itself by $\varphi$. The kernel of $\varphi$ manifestly contains $\mathfrak{p}$. To see that
it equals $\mathfrak{p}$, we argue by contradiction. Choose a polynomial $f$ in $\ker \varphi$ not in $\mathfrak{p}$
whose degree in $Z$ is as small as possible and whose degree in $Y$ is as small as
possible among those of minimal degree in $Z$. If $Z$ occurs somewhere in $f$, then
by replacing all occurrences of $Z$ in $f$ with $X^3$, we replace $f$ by another member
of $f + \mathfrak{p}$ of lower degree in $Z$, contradiction. Thus $f$ has no $Z$ in it. Arguing
similarly, we see that $f$ has no $Y$ in it. So $f$ is a polynomial in $X$. Since $\varphi$ acts
as the identity on polynomials in $X$ alone, $f = 0$. This contradiction shows that
$\ker \varphi = \mathfrak{p}$. Since $\operatorname{image} \varphi = k[X]$ is an integral domain, $\mathfrak{p}$ is prime. By the
Nullstellensatz, $\mathfrak{p}$ may be described alternatively as the ideal of all polynomials
vanishing on $V(\mathfrak{p})$.


² Warning: The books by Fulton and Hartshorne in the Selected References use the narrow
definition of variety that is reproduced here. Some books by other authors allow all affine algebraic
sets to be called varieties. Volume 1 of Shafarevich’s books does not use the word “variety.”

³ Warning: This definition represents a change from Chapters VIII and IX, corresponding to a
change in point of view. Previously the word “curve” referred to the ideal, and now it is to refer to
the zero locus. From a mathematical standpoint Proposition 10.3 shows that this distinction is not
important in the presence of the irreducibility and the fact that $k$ is algebraically closed. The change
thus represents only a matter of convenience for the exposition.

⁴ Some authors build the condition of irreducibility into the definition of “curve,” but this book
does not.

⁵ Some authors build the condition of irreducibility into the definition of “hypersurface,” but this
book does not.
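
The substitution homomorphism in the twisted cubic example is easy to experiment with. The sketch below is an illustration under stated assumptions: polynomials in $X, Y, Z$ are represented as dictionaries from exponent triples to coefficients, integer coefficients stand in for elements of $k$, and the names `phi`, `gen1`, `gen2`, `in_x_only`, and `other` are hypothetical.

```python
from collections import defaultdict

def phi(poly):
    """Substitution X -> X, Y -> X^2, Z -> X^3.

    poly maps exponent triples (i, j, k) of X^i Y^j Z^k to coefficients.
    The result maps exponents of X to coefficients, with zero terms dropped,
    so the zero polynomial is returned as the empty dictionary.
    """
    image = defaultdict(int)
    for (i, j, k), c in poly.items():
        image[i + 2 * j + 3 * k] += c
    return {e: c for e, c in image.items() if c != 0}

gen1 = {(0, 1, 0): 1, (2, 0, 0): -1}                       # Y - X^2
gen2 = {(0, 0, 1): 1, (3, 0, 0): -1}                       # Z - X^3
in_x_only = {(4, 0, 0): 3, (1, 0, 0): -2, (0, 0, 0): 7}    # 3X^4 - 2X + 7
other = {(1, 1, 0): 1, (0, 0, 1): -1}                      # XY - Z

kernel_contains_generators = phi(gen1) == {} and phi(gen2) == {}
identity_on_x = phi(in_x_only) == {4: 3, 1: -2, 0: 7}
other_in_kernel = phi(other) == {}
```

The last check is consistent with the identity $XY - Z = X(Y - X^2) - (Z - X^3)$, which exhibits $XY - Z$ as a member of $\mathfrak{p}$.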


Every affine variety is nonempty, as a consequence of the Nullstellensatz. In
fact, any prime ideal $\mathfrak{p}$ of $A$ is contained in a maximal ideal $\mathfrak{m}$, whose zero locus
is identified as some point $P$ of $\mathbb{A}^n$. The inclusion $\mathfrak{p} \subseteq \mathfrak{m}$ implies that $V(\mathfrak{p}) \supseteq
V(\mathfrak{m}) = \{P\}$. Affine varieties are characterized by a geometric irreducibility
property that is stated in Corollary 10.4.


Corollary 10.4. The affine varieties in $\mathbb{A}^n$ are characterized as those nonempty
Zariski closed sets that cannot be written as the union of two proper closed subsets.


REMARKS. One says that the affine varieties are those affine algebraic sets that 
are irreducible. Irreducible sets are nonempty by definition. 


PROOF. Let $V(\mathfrak{p})$ be an affine variety with $\mathfrak{p}$ prime, and suppose that $V(\mathfrak{p}) =
E_1 \cup E_2$ with $E_1$ and $E_2$ both closed and properly contained in $V(\mathfrak{p})$. Application
of $I(\,\cdot\,)$ and use of Proposition 10.2b gives $I(V(\mathfrak{p})) = I(E_1) \cap I(E_2)$. Proposition
10.3a allows us to rewrite this conclusion as $\mathfrak{p} = \mathfrak{b}_1 \cap \mathfrak{b}_2$ with $\mathfrak{b}_1 = I(E_1)$ and
$\mathfrak{b}_2 = I(E_2)$. By Problem 10a at the end of Chapter VII, $\mathfrak{p} = \mathfrak{b}_1$ or $\mathfrak{p} = \mathfrak{b}_2$. If
$\mathfrak{p} = \mathfrak{b}_1$, then $V(\mathfrak{p}) = V(\mathfrak{b}_1) = V(I(E_1))$, and this equals $E_1$ by Proposition
10.3b because $E_1$ is closed. Similarly if $\mathfrak{p} = \mathfrak{b}_2$, then $V(\mathfrak{p}) = E_2$. Thus $E_1$ and
$E_2$ cannot both be proper subsets of $V(\mathfrak{p})$.

Conversely suppose that $E$ is an irreducible closed subset of $\mathbb{A}^n$. Let $f$ and
$g$ be members of $A$ with $fg$ in $I(E)$. Then Propositions 10.3b and 10.1c give
$E = V(I(E)) \subseteq V(fg) = V(f) \cup V(g)$. Therefore


$$E = (E \cap V(f)) \cup (E \cap V(g))$$


exhibits $E$ as the union of two closed sets. By irreducibility one of the two closed
sets equals $E$. If $E = E \cap V(f)$, then $E \subseteq V(f)$ and $I(E) \supseteq I(V(f)) \supseteq (f)$.
If $E = E \cap V(g)$, then similarly $I(E) \supseteq (g)$. Either way, one of $f$ and $g$ lies in
$I(E)$. Since $E$ is assumed nonempty, $I(E)$ is proper. Therefore $I(E)$ is prime.


2. Geometric Dimension 


We continue to assume that $k$ is an algebraically closed field and to write $A$
for $k[X_1, \dots, X_n]$. If $\mathfrak{p}$ is a prime ideal in $A$, then the dimension of the affine
variety $V(\mathfrak{p})$ was defined in Section VII.2 to be the transcendence degree of the
field of fractions of the integral domain $A/\mathfrak{p}$ over $k$. This quantity depends only
on $V(\mathfrak{p})$ because $\mathfrak{p}$ can be recovered from $V(\mathfrak{p})$ by the formula $\mathfrak{p} = I(V(\mathfrak{p}))$
given in Proposition 10.3a. The integral domain $A/\mathfrak{p}$ is finitely generated as a
$k$ algebra with generators $X_1 + \mathfrak{p}, \dots, X_n + \mathfrak{p}$, and Theorem 7.22 shows that
this transcendence degree equals the Krull dimension of the ring $A/\mathfrak{p}$, which is
denoted by $\dim A/\mathfrak{p}$. The latter quantity is the supremum of the indices $d$ of all
strictly increasing chains $\mathfrak{p}_0 \subsetneq \mathfrak{p}_1 \subsetneq \cdots \subsetneq \mathfrak{p}_d$ of prime ideals in $A/\mathfrak{p}$.

Because of this equality, it is natural to use the notion of Krull dimension in
order to generalize the definition of dimension from varieties to all nonempty
affine algebraic sets.⁶ If $\mathfrak{a}$ is any proper ideal in $A$, not necessarily prime, and
$V(\mathfrak{a})$ is its locus of common zeros, we might first try defining $\dim V(\mathfrak{a})$ to be the
Krull dimension of $A/\mathfrak{a}$. This approach is a bit cumbersome because two distinct
ideals $\mathfrak{a}$ and $\mathfrak{a}'$ can have $V(\mathfrak{a}) = V(\mathfrak{a}')$; thus some argument would be needed to
see that $\dim V(\mathfrak{a})$ is well defined before it would be possible to proceed.

Instead, we shall give a direct geometric definition of dimension in terms of
the Zariski topology on $\mathbb{A}^n$. Theorem 10.7 later in this section will show that the
geometric quantity $\dim V(\mathfrak{a})$ equals the Krull dimension of $A/\sqrt{\mathfrak{a}}$, thus that the
dimension of an affine algebraic set has an algebraic formulation. From this result
we shall deduce that $\dim V(\mathfrak{a})$ equals the Krull dimension of $A/\mathfrak{a}$ itself. This
algebraic formulation of a definition will not yet allow us to compute dimensions
concretely, but we shall introduce in Sections 9-11 an equivalent combinatorial
definition of dimension that is computable in terms of Gröbner bases.

A topological space $X$ will be said to be Noetherian if every strictly decreasing
sequence of closed subsets is finite in length. An example is affine n-space $\mathbb{A}^n$.
In fact, if $E_1, E_2, \dots$ are closed sets in $\mathbb{A}^n$ with $E_1 \supseteq E_2 \supseteq \cdots$, then the
corresponding ideals have $I(E_1) \subseteq I(E_2) \subseteq \cdots$. Since $A$ is Noetherian, there
exists some integer $k$ with $I(E_k) = I(E_{k+1}) = \cdots$. Applying $V(\,\cdot\,)$ and using
Proposition 10.3b, we obtain $E_k = E_{k+1} = \cdots$.

We can generalize the definition of irreducibility for closed sets from $\mathbb{A}^n$ to
an arbitrary Noetherian topological space. Namely a nonempty closed set $E$ is
irreducible if it is not the union of two proper closed subsets. An important
observation about any Noetherian topological space is that any nonempty relatively
open subset $U$ of an irreducible closed set $V$ is dense in $V$; in fact, if $\overline{U}$ denotes
the closure of $U$, then $V = \overline{U} \cup (V - U)$ exhibits $V$ as the union of two closed
subsets, and the irreducibility forces $\overline{U} = V$ since $V - U \ne V$.


Proposition 10.5. If X is a Noetherian topological space, then any closed 
subset is the finite union of irreducible closed subsets. This decomposition of a 
closed set as such a union may be chosen in such a way that none of the closed sets 
in the union contains another set in the union, and in this case the decomposition 
is unique. 


⁶ We shall leave the dimension of the empty set as undefined for now.
PROOF. For existence of some decomposition of each closed set as a finite
union of irreducible closed subsets, we argue by contradiction. Assuming that
there exists some closed subset $E$ of $X$ that is not the finite union of irreducible
closed subsets, we may assume by the Noetherian condition on $X$ that $E$ is minimal
among all such counterexamples. Since $E$ cannot itself be irreducible, we can
write $E = E_1 \cup E_2$ with $E_1$ and $E_2$ closed and properly contained in $E$. Since
$E$ is minimal among all closed subsets that are not the finite union of irreducible
closed subsets, $E_1$ and $E_2$ can be expressed as finite unions of irreducible closed
subsets. Substituting these expressions into the equality $E = E_1 \cup E_2$ gives a
contradiction to the fact that $E$ is a counterexample.

This proves existence of a decomposition. By going through the sets in the 
decomposition one at a time and by discarding any set that is contained in another 
set, we obtain a decomposition as in the second sentence of the proposition. 

For uniqueness, suppose that $E = E_1 \cup \cdots \cup E_k = F_1 \cup \cdots \cup F_l$ gives two
decompositions of the asserted kind. Say that $k \ge l$. Since $F_i \subseteq E_1 \cup \cdots \cup E_k$,
we obtain $F_i = (F_i \cap E_1) \cup \cdots \cup (F_i \cap E_k)$. Irreducibility of $F_i$ implies that
$F_i = F_i \cap E_j$ for some $j = j(i)$. Hence $F_i \subseteq E_{j(i)}$ for some function $j(i)$
from $\{1, \dots, l\}$ to $\{1, \dots, k\}$. Reversing the roles of the $E_j$’s and the $F_i$’s yields
a function $i(j)$ such that $E_j \subseteq F_{i(j)}$. Then $F_i \subseteq E_{j(i)} \subseteq F_{i(j(i))}$. Since no $F_i$
contains some $F_{i'}$ with $i' \ne i$, we conclude that $i(j(i)) = i$ for all $i$. Therefore
$k = l$, and $i(\,\cdot\,)$ and $j(\,\cdot\,)$ are inverse to each other.
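
The discarding step in the second paragraph of the proof is pure bookkeeping and can be sketched in a few lines. The code below is an illustration only: finite Python sets stand in for the irreducible closed sets of a decomposition, and the function name `prune` is hypothetical.

```python
def prune(components):
    """Discard every set that is contained in another one.

    Duplicates are merged first. Candidates are visited from largest to
    smallest, so any set containing a candidate (other than the candidate
    itself) has already been examined when the candidate is reached.
    """
    kept = []
    for c in sorted(set(map(frozenset, components)), key=len, reverse=True):
        if not any(c <= k for k in kept):
            kept.append(c)
    return kept

# Hypothetical decomposition with redundant members.
example = prune([{1, 2}, {1}, {2, 3}, {1, 2}, {3}])
```

The union of the sets is unchanged by the pruning, and no set that survives contains another, which is the situation in which the proposition asserts uniqueness.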


Corollary 10.6. Every affine algebraic set in $\mathbb{A}^n$ can be expressed uniquely as
the finite (possibly empty) union of affine varieties in such a way that none of the
varieties contains another of the varieties.


REMARKS. For example,

$$V(X^2 - Y^2) = V(X + Y) \cup V(X - Y)$$


by Proposition 10.1c, and the affine algebraic set on the left side is expressed as 
the union of the affine varieties on the right. 


PROOF. We saw before Proposition 10.5 that $\mathbb{A}^n$ is a Noetherian topological
space, and Corollary 10.4 shows that the irreducible subsets are the affine varieties.
The closed sets are the affine algebraic sets by definition, and hence the result is 
a special case of Proposition 10.5. 


The geometric dimension of a nonempty closed subset $E$ of a Noetherian
topological space $X$ is the supremum of the integers $d \ge 0$ such that there exists
a strictly increasing chain $E_0 \subsetneq E_1 \subsetneq \cdots \subsetneq E_d$ of irreducible closed subsets
of $E$. This definition makes sense because a chain with $d = 0$ can always be
formed with $E_0$ equal to one of the irreducible closed sets from Proposition 10.5;
however, there is no guarantee in this generality that the geometric dimension
will be finite. In any event, it is clear from the definition that if two closed sets
$E$ and $E'$ have $E \subseteq E'$, then the geometric dimension of $E$ is $\le$ the geometric
dimension of $E'$.

In the case of a nonempty affine algebraic set V (S), the geometric dimension 
of V(S) is to refer to this kind of dimension relative to the Zariski topology. 
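
For a finite topological space, which is automatically Noetherian, the definitions of irreducibility and geometric dimension can be evaluated mechanically. The sketch below is an illustration under stated assumptions: a space is described by an explicit list of its closed sets (assumed by the caller to be closed under finite unions and intersections), and the function names and sample spaces are hypothetical.

```python
def irreducible_closed(closed_sets):
    """Nonempty closed sets that are not a union of two proper closed subsets."""
    closed = {frozenset(c) for c in closed_sets}
    result = []
    for e in closed:
        if not e:
            continue  # irreducible sets are nonempty by definition
        proper = [c for c in closed if c < e]
        if not any(a | b == e for a in proper for b in proper):
            result.append(e)
    return result

def geometric_dimension(closed_sets, e):
    """Largest d admitting a strict chain E_0 < ... < E_d of irreducible
    closed subsets of the nonempty closed set e.

    The empty set is not handled; its dimension is left undefined.
    """
    e = frozenset(e)
    irr = sorted((c for c in irreducible_closed(closed_sets) if c <= e), key=len)
    best = {}
    for c in irr:  # increasing size, so strict subsets are processed first
        best[c] = 1 + max((best[s] for s in irr if s < c), default=-1)
    return max(best.values())

# Two points with one closed point "c" and one dense point "g".
sierpinski = [set(), {"c"}, {"g", "c"}]
# Two points with the discrete topology.
discrete = [set(), {1}, {2}, {1, 2}]
# Three points whose closed sets form a chain.
chain3 = [set(), {"a"}, {"a", "b"}, {"a", "b", "c"}]

dim_sierpinski = geometric_dimension(sierpinski, {"g", "c"})
dim_discrete = geometric_dimension(discrete, {1, 2})
dim_chain3 = geometric_dimension(chain3, {"a", "b", "c"})
```

In the discrete example the whole space is the union of its two points, so it is reducible and only chains of length 0 occur, whereas in the other two examples every nonempty closed set is irreducible.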


EXAMPLES OF GEOMETRIC DIMENSION IN $\mathbb{A}^n$.

(1) Any one-point set in $\mathbb{A}^n$ is closed and plainly has geometric dimension 0.
Any affine variety $V$ with more than one point has geometric dimension $\ge 1$,
since $\{P\} \subsetneq V$ is a strictly increasing chain of irreducible closed sets if $P$ is
chosen as a point in $V$.

(2) $\mathbb{A}^n$ has geometric dimension $n$. This fact will follow from Theorem 10.7
below because $A$ has Krull dimension $n$ as a consequence of Theorem 7.22.

(3) Twisted cubic in $\mathbb{A}^3$, namely $\{(x, x^2, x^3) \mid x \in k\}$. According to the
example in Section 1, this is $V(\mathfrak{p})$ for the prime ideal $\mathfrak{p} = (Y - X^2, Z - X^3) \subseteq
k[X, Y, Z]$. The inclusions of prime ideals $(X, Y, Z) \supsetneq (Y - X^2, Z - X^3) \supsetneq
(Y - X^2) \supsetneq 0$ give the strictly increasing chain $\{0\} \subsetneq V(\mathfrak{p}) \subsetneq \{(x, x^2, z)\} \subsetneq \mathbb{A}^3$,
which is of the kind described for $\mathbb{A}^3$. If another term could be included between
$\{0\}$ and $V(\mathfrak{p})$, then we would obtain a sequence showing that $\mathbb{A}^3$ has geometric
dimension $\ge 4$, in contradiction to Example 2. So $V(\mathfrak{p})$ has geometric dimension
$\le 1$. In view of Example 1, $V(\mathfrak{p})$ has geometric dimension equal to 1.


Theorem 10.7. If $\mathfrak{a}$ is any proper ideal of $A$, then the following four quantities
are equal:
(a) the geometric dimension of $V(\mathfrak{a})$,
(b) the Krull dimension of $A/\sqrt{\mathfrak{a}}$,
(c) the maximum of the geometric dimension of $V_i$ over all affine varieties
$V_i$ contained in $V(\mathfrak{a})$,
(d) the Krull dimension of $A/\mathfrak{a}$.


REMARKS. We take these equal quantities as the definition of the dimension
$\dim V(\mathfrak{a})$ of the affine algebraic set $V(\mathfrak{a})$. Because of Theorem 7.22, these
quantities equal the transcendence degree over $k$ of the field of fractions of $A/\mathfrak{a}$
in the case that $\mathfrak{a}$ is a prime ideal. For $\mathfrak{a} = 0$, we know that $\dim A = n$; hence
the equal quantities in the theorem are $\le n$.


PROOF. Let

$$E_0 \subseteq E_1 \subseteq \cdots \subseteq E_d \qquad (*)$$


be an increasing chain of irreducible closed subsets of $V(\mathfrak{a})$, and define $\mathfrak{p}_j$ to be
the ideal $\mathfrak{p}_j = I(E_j)$. Then each $\mathfrak{p}_j$ is a prime ideal by Corollary 10.4, and also


$$\mathfrak{p}_d \subseteq \cdots \subseteq \mathfrak{p}_1 \subseteq \mathfrak{p}_0 \qquad (**)$$


because $I(\,\cdot\,)$ is inclusion reversing. If $(*)$ is strictly increasing, then so is $(**)$; in
fact, if $\mathfrak{p}_j$ were to equal $\mathfrak{p}_{j-1}$ for some $j$, then we would have $E_j = V(I(E_j)) =
V(\mathfrak{p}_j) = V(\mathfrak{p}_{j-1}) = V(I(E_{j-1})) = E_{j-1}$, contradiction. In $(*)$, we have $E_d \subseteq
V(\mathfrak{a})$, and thus Proposition 10.3a gives $\sqrt{\mathfrak{a}} = I(V(\mathfrak{a})) \subseteq I(E_d) = \mathfrak{p}_d$. In other
words, any strictly increasing sequence $(*)$ of irreducible closed subsets of $V(\mathfrak{a})$
yields a strictly increasing sequence $(**)$ of prime ideals of $A$ that contain $\sqrt{\mathfrak{a}}$.

Conversely if $(**)$ is a strictly increasing sequence of prime ideals of $A$ containing
$\sqrt{\mathfrak{a}}$, and if we define $E_j = V(\mathfrak{p}_j)$ for $0 \le j \le d$, then we obtain the
sequence $(*)$ of irreducible closed subsets of $V(\sqrt{\mathfrak{a}}) = V(\mathfrak{a})$, and $(*)$ is strictly
increasing, since an equality $E_j = E_{j-1}$ would imply that $\mathfrak{p}_j = I(V(\mathfrak{p}_j)) =
I(E_j) = I(E_{j-1}) = I(V(\mathfrak{p}_{j-1})) = \mathfrak{p}_{j-1}$ because of Proposition 10.3a.

Thus the strictly increasing sequences $(*)$ of irreducible closed subsets of $V(\mathfrak{a})$
are in one-one correspondence with the strictly increasing sequences $(**)$ of prime
ideals of $A$ containing $\sqrt{\mathfrak{a}}$. Let $\varphi : A \to A/\sqrt{\mathfrak{a}}$ be the quotient homomorphism.
Application of $\varphi$ to $(**)$ yields a strictly increasing sequence of ideals of $A/\sqrt{\mathfrak{a}}$
by the First Isomorphism Theorem, and prime ideals map to prime ideals under
this correspondence. Thus the existence of a strictly increasing sequence as in
$(**)$ implies that the Krull dimension of $A/\sqrt{\mathfrak{a}}$ is $\ge d$. Meanwhile, the existence
of a strictly increasing sequence as in $(*)$ implies that the geometric dimension of
$V(\mathfrak{a})$ is $\ge d$. We have seen that these sequences are in one-one correspondence,
and therefore the equality of (a) and (b) in the theorem follows.

In (c) certainly the geometric dimension of any $V_i$ is $\le$ the geometric dimension
of $V(\mathfrak{a})$. If $d_0$ denotes the geometric dimension of $V(\mathfrak{a})$, then we can find a strictly
increasing chain as in $(*)$ with $d = d_0$ and with all the sets contained in $V(\mathfrak{a})$.
Corollary 10.4 shows that $E_{d_0}$ is an affine variety contained in $V(\mathfrak{a})$, and the
sequence $(*)$ shows that the geometric dimension of $E_{d_0}$ is at least $d_0$. Thus
$V_i = E_{d_0}$ is an affine variety contained in $V(\mathfrak{a})$ whose geometric dimension
equals that of $V(\mathfrak{a})$.

To complete the proof, we show the equality of (b) and (d), i.e., we show that
$A/\mathfrak{a}$ and $A/\sqrt{\mathfrak{a}}$ have the same Krull dimension. Since $\mathfrak{a} \subseteq \sqrt{\mathfrak{a}}$, it is enough to
show that in any strictly increasing sequence of prime ideals as in $(**)$ such that
all the ideals contain $\mathfrak{a}$, all the ideals actually contain $\sqrt{\mathfrak{a}}$. (Then the sequences
$(**)$ for $\mathfrak{a}$ will be in one-one correspondence with the sequences for $\sqrt{\mathfrak{a}}$, and we
can argue using the First Isomorphism Theorem as in the third paragraph of the
proof.) Thus let $x$ be in $\sqrt{\mathfrak{a}}$. By definition of radical, $x^k$ lies in $\mathfrak{a}$ for some $k$.
Since $\mathfrak{a} \subseteq \mathfrak{p}_d$, $x^k$ lies in $\mathfrak{p}_d$. But $\mathfrak{p}_d$ is prime, and therefore $x$ lies in $\mathfrak{p}_d$. Thus
every ideal in the sequence $(**)$ for $\mathfrak{a}$ occurs in the sequence $(**)$ for $\sqrt{\mathfrak{a}}$, and
the theorem follows.


The dimension of an irreducible hypersurface in $A = k[X_1, \dots, X_n]$ is $n - 1$,
as was observed in Section VII.5. Proposition 10.9 below will prove a converse.


Lemma 10.8. Every minimal nonzero prime ideal in $A$ is principal.


PROOF. Let $\mathfrak{p}$ be a minimal nonzero prime ideal, let $f \ne 0$ be a nonzero
member, and write $f$ as the product of irreducible elements. Since $\mathfrak{p}$ is prime,
one of the irreducible elements, say $g$, lies in $\mathfrak{p}$. Since $A$ is a unique factorization
domain, $g$ is prime. Consequently $(g)$ is a prime ideal of $A$ lying in $\mathfrak{p}$. By
minimality of $\mathfrak{p}$, $\mathfrak{p} = (g)$.


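Proposition 10.9. Suppose that $\mathfrak{p}$ is a prime ideal of $A$ and $V(\mathfrak{p})$ is the
corresponding affine variety. If $\dim V(\mathfrak{p}) = n - 1$, then $\mathfrak{p}$ is principal, and hence
$V(\mathfrak{p})$ is an irreducible hypersurface.

PROOF. For any $n \ge 1$, $\dim V(\mathfrak{p}) = n - 1 < n = \dim V(0)$ implies $\mathfrak{p} \ne 0$.
Since $\dim V(\mathfrak{p}) = n - 1$, there exists a chain


$$0 = \mathfrak{q}_0 \subsetneq \mathfrak{q}_1 \subsetneq \cdots \subsetneq \mathfrak{q}_{n-1}$$


of prime ideals in $A/\mathfrak{p}$. If $\varphi : A \to A/\mathfrak{p}$ denotes the quotient homomorphism,
then this chain lifts to $A$ as


$$0 \subsetneq \mathfrak{p} \subsetneq \varphi^{-1}(\mathfrak{q}_1) \subsetneq \cdots \subsetneq \varphi^{-1}(\mathfrak{q}_{n-1}).$$


This chain has $n$ members after the 0 at the left, and $A$ has Krull dimension $n$.
Consequently the first nonzero element, which is $\mathfrak{p}$, is a minimal nonzero prime
ideal of $A$. By Lemma 10.8, $\mathfrak{p}$ is principal.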


A quasi-affine variety is any nonempty Zariski open subset of an affine variety.
These sets and their projective analogs, which will be defined in Section 3, will be
the main objects of interest geometrically in Sections 1-6. If $Y$ is a quasi-affine
variety, then the closure $\overline{Y}$ is the affine variety in question because any nonempty
relatively open subset of an affine variety is dense in the variety.⁷

Let us see that the relative Zariski topology on a quasi-affine variety $Y$ makes
$Y$ into a Noetherian topological space. In fact, if $X$ is a Noetherian topological
space and $Y$ is a topological subspace, then $Y$ is Noetherian. To see this, we
argue by contradiction, letting $E_1 \supsetneq E_2 \supsetneq \cdots$ be a strictly decreasing sequence
of relatively closed sets in $Y$. Then the sequence of closures $\overline{E_j}$ in $X$ forms a
decreasing sequence of closed sets in $X$ with the property that $E_j = Y \cap \overline{E_j}$ for
each $j$ because $E_j$ is assumed to be relatively closed in $Y$. It follows that the
sequence of closures is strictly decreasing, contradiction.

Consequently any quasi-affine variety Y is Noetherian in the relative Zariski 
topology and has a meaningful geometric dimension. We write dim Y for this 
dimension. 


⁷ This important observation was made just before Proposition 10.5.
Lemma 10.10. If $Y$ is a quasi-affine variety in $\mathbb{A}^n$ and if $E$ is a nonempty
relatively closed subset of $Y$, then $E$ is irreducible⁸ for $Y$ if and only if $\overline{E}$ is
irreducible for $\mathbb{A}^n$.


REMARKS. We shall actually prove the stronger result that if $Y$ is a nonempty
open subset of a Noetherian topological space $X$ (such as $\mathbb{A}^n$) and if $E$ is a
nonempty relatively closed subset of $Y$, then $E$ is irreducible for $Y$ if and only if
$\overline{E}$ is irreducible for $X$. This stronger result will be used in Section 3.


PROOF. First we check that $E$ reducible implies $\overline{E}$ reducible. If $E$ is reducible,
say is a union $E = E_1 \cup E_2$ with $E_1$ and $E_2$ relatively closed proper subsets of
$E$, then $\overline{E} = \overline{E_1} \cup \overline{E_2}$. Each of $\overline{E_1}$ and $\overline{E_2}$ is a closed subset of $\overline{E}$. To see that
$\overline{E_1}$ is proper, we argue by contradiction. If $\overline{E_1} = \overline{E}$, then intersecting both sides
with $Y$ gives the contradiction $E_1 = Y \cap \overline{E_1} = Y \cap \overline{E} = E$ because $E_1$ and $E$
are both relatively closed. Similarly $\overline{E_2}$ is proper, and thus $\overline{E}$ is reducible.

Conversely suppose that $\overline{E}$ is reducible, say is a union $\overline{E} = F_1 \cup F_2$ with $F_1$
and $F_2$ closed in $X$ and properly contained in $\overline{E}$. Intersecting both sides with
$Y$ gives $E = Y \cap \overline{E} = Y \cap (F_1 \cup F_2) = (Y \cap F_1) \cup (Y \cap F_2)$ because $E$ is
relatively closed. The sets $Y \cap F_1$ and $Y \cap F_2$ are relatively closed, and their
union is $E$. To see that $E$ is reducible, we argue by contradiction. If $Y \cap F_1 = E$,
then $E \subseteq F_1$. Since $F_1$ is closed in $X$, $\overline{E} \subseteq F_1$. Thus $F_1$ is not a proper subset
of $\overline{E}$, contradiction. Similarly we cannot have $Y \cap F_2 = E$, and therefore $E$
is exhibited as the union of the two proper relatively closed subsets $Y \cap F_1$ and
$Y \cap F_2$.


Proposition 10.11. If $Y$ is a quasi-affine variety in $\mathbb{A}^n$, then $\dim Y = \dim \overline{Y}$.
Here $\dim \overline{Y}$ refers to the dimension of the affine variety $\overline{Y}$ in any of the senses of
Theorem 10.7.


REMARKS. This proposition is a formal consequence of Lemma 10.10. The 
stronger statement that we actually prove is that if Y is a nonempty open subset 
of a Noetherian topological space X, then the geometric dimension of Y as a 
Noetherian space equals the geometric dimension of X as a Noetherian space. 


PROOF. Let $E_0 \subseteq E_1 \subseteq \cdots \subseteq E_d$ be a strictly increasing sequence of relatively
closed irreducible subsets of $Y$. Then $\overline{E_0} \subseteq \overline{E_1} \subseteq \cdots \subseteq \overline{E_d}$ is an increasing
sequence of closed subsets of $\mathbb{A}^n$, each of which is irreducible by Lemma 10.10.
Since $E_j = Y \cap \overline{E_j}$ for each $j$, the sets $\overline{E_j}$ are strictly increasing. Since the given
sequence of sets $E_j$ is arbitrary, it follows that $\dim Y \le \dim \overline{Y}$.

For the reverse inequality, let $F_0 \subseteq F_1 \subseteq \cdots \subseteq F_d$ be a strictly increasing
sequence of irreducible closed subsets of $\overline{Y}$. If $E_j$ denotes $F_j \cap Y$, then $E_0 \subseteq
E_1 \subseteq \cdots \subseteq E_d$ is an increasing sequence of relatively closed subsets of $Y$,
each of which is irreducible by Lemma 10.10. Since $F_j = \overline{E_j}$, the sets $E_j$ are
strictly increasing. Since the given sequence of sets $F_j$ is arbitrary, it follows that
$\dim \overline{Y} \le \dim Y$.


⁸ ... in the sense of not being the union of two relatively closed proper subsets.


3. Projective Algebraic Sets and Projective Varieties 


We continue to assume that $k$ is an algebraically closed field and to write $A$
for $k[X_1, \dots, X_n]$. In Section VIII.3 we studied the projective analogs of affine
plane curves, and the task for the present section is to study similarly the projective
analogs of general affine algebraic sets, affine varieties, and quasi-affine varieties.
As in Section VIII.3, projective n-space over $k$ is defined set theoretically as
the quotient

$$\mathbb{P}^n = \{(x_0, \dots, x_n) \in k^{n+1} - \{0\}\} / \sim,$$


where $(x_0', \dots, x_n') \sim (x_0, \dots, x_n)$ if $(x_0', \dots, x_n') = \lambda (x_0, \dots, x_n)$ for some
$\lambda \in k^\times$. We write $[x_0, \dots, x_n]$ for the class of $(x_0, \dots, x_n)$ in $\mathbb{P}^n$.

Put $\widetilde{A} = k[X_0, \dots, X_n]$. The polynomials of interest for algebraic geometry
relative to $\mathbb{P}^n$ are the homogeneous polynomials in $\widetilde{A}$. The definitions of “monomial,”
“total degree” of a monomial, “homogeneous polynomial,” and “degree” of
a homogeneous polynomial all appear in Section VIII.3; monomials are defined so
as to have coefficient 1. By convention the 0 polynomial is homogeneous of every
degree. We write $\widetilde{A}_d = k[X_0, \dots, X_n]_d$ for the $k$ vector space of homogeneous
polynomials of degree $d$. Each member $F$ of $\widetilde{A}_d$ satisfies


$$F(\lambda x_0, \dots, \lambda x_n) = \lambda^d F(x_0, \dots, x_n)$$


for all $(x_0, \dots, x_n) \in k^{n+1}$ and $\lambda \in k^\times$. Conversely the fact that the mapping of
polynomials into polynomial functions is one-one for an infinite field implies that
a member $F$ of $\widetilde{A}$ is homogeneous of degree $d$ if it satisfies the above displayed
property. Four further properties of $\widetilde{A}_d$ from Section VIII.3 are that


• the zero locus of a member of $\widetilde{A}_d$ is well defined as a subset of $\mathbb{P}^n$,

• the monomials of total degree $d$ form a $k$ basis of the vector space $\widetilde{A}_d$,

• $\dim_k \widetilde{A}_d = \binom{d+n}{n}$,

• any polynomial factor of a homogeneous polynomial over a field $k$ is
homogeneous.
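
The scaling property and the count of monomials can be spot-checked numerically. The sketch below is an illustration only: integer coefficients and integer sample points stand in for elements of $k$, the polynomial `F` is a hypothetical sample, and the count is compared with the binomial coefficient $\binom{n+d}{n}$, which is the number of monomials of total degree $d$ in $n+1$ variables.

```python
from itertools import product
from math import comb

def monomials(n, d):
    """Exponent tuples (e_0, ..., e_n) with e_0 + ... + e_n = d."""
    return [e for e in product(range(d + 1), repeat=n + 1) if sum(e) == d]

def evaluate(poly, point):
    """Evaluate a polynomial given as {exponent tuple: coefficient}."""
    total = 0
    for exps, c in poly.items():
        term = c
        for x, e in zip(point, exps):
            term *= x ** e
        total += term
    return total

# The number of monomials of total degree d in n + 1 variables.
counts_match = all(len(monomials(n, d)) == comb(n + d, n)
                   for n in range(1, 4) for d in range(0, 5))

# Hypothetical homogeneous polynomial of degree 3 in X0, X1, X2:
# X0^2 X2 - 3 X0 X1 X2 + 2 X1^3.
F = {(2, 0, 1): 1, (1, 1, 1): -3, (0, 3, 0): 2}
point, lam, d = (2, -1, 5), 3, 3
scaled = tuple(lam * x for x in point)
homogeneity_holds = evaluate(F, scaled) == lam ** d * evaluate(F, point)
```

The scaling identity is what makes the vanishing of a homogeneous polynomial independent of the representative chosen for a point of projective space.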


An ideal $\mathfrak{a}$ in $\widetilde{A}$ is called a homogeneous ideal if it is the vector-space
sum over $d \ge 0$ of its intersections with $\widetilde{A}_d$: $\mathfrak{a} = \bigoplus_{d=0}^{\infty} (\mathfrak{a} \cap \widetilde{A}_d)$. Any ideal
in $\widetilde{A}$ that is generated by homogeneous polynomials is a homogeneous ideal. A
special case of this fact is that if a $k$ vector subspace $\mathfrak{a}_d$ of $\widetilde{A}_d$ is specified for each
integer $d \ge 0$, then $\mathfrak{a} = \bigoplus_{d=0}^{\infty} \mathfrak{a}_d$ is a homogeneous ideal if and only if for each
$d \ge 0$ and $e \ge 0$, the inclusion $F \mathfrak{a}_d \subseteq \mathfrak{a}_{d+e}$ holds for each $F$ in $\widetilde{A}_e$.

We can now imitate some of the development of Sections 1 and 2 for the
present context as long as we stick to homogeneous polynomials in $\widetilde{A}$ and to
homogeneous ideals. For any homogeneous polynomial $F$ in $\widetilde{A}$, the set


$$V(F) = \{[x_0, \dots, x_n] \in \mathbb{P}^n \mid F(x_0, \dots, x_n) = 0\}$$


is well defined by the first bulleted property above. Thus if $S$ is any set of
homogeneous elements in $\widetilde{A}$, we can associate the locus of common zeros in $\mathbb{P}^n$,
or zero locus, of the members of $S$ by the formula


$$V(S) = \bigcap_{F \in S} V(F).$$


If $\mathfrak{a}$ is a homogeneous ideal, then $V(\mathfrak{a})$ by convention means $V(S)$, where $S$ is
the subset of all homogeneous members of $\mathfrak{a}$. Any such set $V(S)$ is called a
projective algebraic set in $\mathbb{P}^n$. The function $S \mapsto V(S)$ is inclusion reversing.
The analog of Proposition 10.1 in the present context is that projective algebraic
sets have the following properties:
(i) $V(\varnothing) = V(0) = \mathbb{P}^n$ and $V(\widetilde{A}) = \varnothing$,
(ii) $V\big(\bigcup_\alpha S_\alpha\big) = \bigcap_\alpha V(S_\alpha)$ if the $S_\alpha$’s are arbitrary sets of homogeneous
elements in $\widetilde{A}$,
(iii) $V(S) = V(S_1) \cup V(S_2)$ if $S_1$ and $S_2$ are sets of homogeneous elements
in $\widetilde{A}$ and if $S$ is defined as the set of all products $F_1 F_2$ with $F_1 \in S_1$ and
$F_2 \in S_2$.
Consequently the projective algebraic sets in $\mathbb{P}^n$ form the closed sets for a topology
on $\mathbb{P}^n$ called the Zariski topology on $\mathbb{P}^n$.
Next we associate to each point $P$ of $\mathbb{P}^n$ a homogeneous ideal $I(P)$ in $\widetilde{A}$ by
the definition


$$I(P) = \{F \in \widetilde{A} \mid F(x_0, \dots, x_n) = 0 \text{ whenever } [x_0, \dots, x_n] = P\}.$$


Problem 1 at the end of the chapter shows that $I(P)$ is indeed a homogeneous
ideal. In terms of the ideals $I(P)$, we define $I(E) = \bigcap_{P \in E} I(P)$ for each
subset $E$ of $\mathbb{P}^n$. The result $E \mapsto I(E)$ is a function carrying subsets $E$ of $\mathbb{P}^n$ to
homogeneous ideals $I(E)$ in $\widetilde{A}$. The function $E \mapsto I(E)$ is inclusion reversing,
and the same argument as for Proposition 10.2 shows that for each $n$ it satisfies
(i) $I(\varnothing) = \widetilde{A}$ and $I(\mathbb{P}^n) = 0$,
(ii) $I(E_1 \cup E_2) = I(E_1) \cap I(E_2)$ if $E_1$ and $E_2$ are subsets of $\mathbb{P}^n$,
(iii) $I(E_1 \cap E_2) \supseteq I(E_1) + I(E_2)$ if $E_1$ and $E_2$ are subsets of $\mathbb{P}^n$.
If $S$ is any set of homogeneous elements in $\widetilde{A}$ and if $V = V(S)$ is the
corresponding projective algebraic set in $\mathbb{P}^n$, then we define the cone over $V$
to be the subset of $\mathbb{A}^{n+1}$ given by


$$C(V) = \{(0, \dots, 0)\} \cup \{(x_0, \dots, x_n) \in \mathbb{A}^{n+1} \mid [x_0, \dots, x_n] \in V\}.$$


This kind of set has the following two properties:
(i) $V$ nonempty implies that the ideals $I(C(V))$ and $I(V)$ in $\widetilde{A}$ are equal,
(ii) any homogeneous ideal $\mathfrak{a}$ in $\widetilde{A}$ with $V(\mathfrak{a})$ nonempty in $\mathbb{P}^n$ has $C(V(\mathfrak{a}))$
equal to the subset $V(\mathfrak{a})$ in affine $(n + 1)$-space.
Use of this device reduces a number of questions about $\mathbb{P}^n$ to questions about
$\mathbb{A}^{n+1}$. An example is a projective analog of Proposition 10.3, which appears as
the next proposition.
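
Property (ii) of cones can be seen concretely in a small case. The sketch below is an illustration under stated assumptions: the finite field with 5 elements replaces the algebraically closed field $k$ so that every set is finite, each point of the projective plane is represented by the vector whose first nonzero coordinate is 1, and the homogeneous polynomial `F` and the helper `normalize` are hypothetical.

```python
from itertools import product

P = 5  # assumption: the finite field F_5, chosen only so that every set is finite

def normalize(v, p=P):
    """Scale a nonzero vector so that its first nonzero coordinate is 1."""
    lead = next(x for x in v if x % p)
    inv = pow(lead, -1, p)  # modular inverse, available in Python 3.8+
    return tuple(x * inv % p for x in v)

def F(x0, x1, x2):
    """Hypothetical homogeneous polynomial of degree 2: X0 X1 - X2^2."""
    return x0 * x1 - x2 * x2

vectors = list(product(range(P), repeat=3))
nonzero = [v for v in vectors if any(v)]

affine_zeros = {v for v in vectors if F(*v) % P == 0}
projective_zeros = {normalize(v) for v in nonzero if F(*v) % P == 0}

# The zero locus in the projective plane does not depend on the representative.
well_defined = all((F(*v) % P == 0) == (F(*normalize(v)) % P == 0) for v in nonzero)

# Cone over V: the origin together with every vector whose class lies in V.
cone = {(0, 0, 0)} | {v for v in nonzero if normalize(v) in projective_zeros}
cone_is_affine_zero_locus = cone == affine_zeros

num_projective_points = len({normalize(v) for v in nonzero})
```

Here the cone over the projective zero locus of `F` coincides with the zero locus of `F` in affine 3-space, in line with property (ii).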


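Proposition 10.12. For fixed $n$,

(a) (homogeneous Nullstellensatz) a homogeneous ideal $\mathfrak{a}$ in $\widetilde{A}$ has $V(\mathfrak{a})$
empty in $\mathbb{P}^n$ if and only if there is an integer $N$ such that $\mathfrak{a}$ contains $\widetilde{A}_k$
for $k \ge N$,

(b) $I(V(\mathfrak{a})) = \sqrt{\mathfrak{a}}$ for each homogeneous ideal $\mathfrak{a}$ in $\widetilde{A}$ for which $V(\mathfrak{a})$ is
nonempty in $\mathbb{P}^n$,

(c) $V(I(E)) = \overline{E}$ for each subset $E$ of $\mathbb{P}^n$, where $\overline{E}$ is the Zariski closure of
$E$ in $\mathbb{P}^n$.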


REMARK. For clarity in the proof, let us write $V_a(\,\cdot\,)$ and $V_p(\,\cdot\,)$ to distinguish
zero loci in $\mathbb{A}^{n+1}$ from zero loci in $\mathbb{P}^n$.


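PROOF. For (a), $V_p(\mathfrak{a})$ is empty in $\mathbb{P}^n$ if and only if $V_a(\mathfrak{a})$ is contained in
$\{0\}$ in $\mathbb{A}^{n+1}$, if and only if $\sqrt{\mathfrak{a}} = I(V_a(\mathfrak{a}))$ contains $(X_0, \dots, X_n)$ by the affine
Nullstellensatz. In this case if $f_1, \dots, f_r$ are generators of $\sqrt{\mathfrak{a}}$, then the elements
$f_1^m, \dots, f_r^m$ are in $\mathfrak{a}$ for some $m$, and it follows that $\bigl(\sum_{i=1}^r c_i f_i\bigr)^k$ lies in $\mathfrak{a}$ for
all scalars $c_i$ whenever $k \ge rm$; hence $\widetilde{A}_k \subseteq \mathfrak{a}$ for $k \ge rm$. Conversely if $\sqrt{\mathfrak{a}}$
fails to contain some $X_j$, then $X_j^k$ is not in $\mathfrak{a}$ for any $k \ge 1$, and $\widetilde{A}_k$ cannot be
contained in $\mathfrak{a}$.

For (b), with $I_a(\,\cdot\,)$ and $I_p(\,\cdot\,)$ distinguished in the same way as the zero loci,
$I_p(V_p(\mathfrak{a})) = I_a(C(V_p(\mathfrak{a}))) = I_a(V_a(\mathfrak{a})) = \sqrt{\mathfrak{a}}$ by (i) of cones, (ii) of
cones, and the affine Nullstellensatz.

Conclusion (c) is proved by the same argument as for Proposition 10.3b.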
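
As a small illustration of conclusion (a), here is a worked instance that is not taken from the text: it assumes $n = 1$ and the particular ideal $\mathfrak{a} = (X_0^2, X_1^2)$, chosen only because the threshold $N$ can be read off by hand.

```latex
\[
\mathfrak{a} = (X_0^2,\ X_1^2) \subseteq \widetilde{A} = k[X_0, X_1],
\qquad \sqrt{\mathfrak{a}} = (X_0, X_1),
\qquad V(\mathfrak{a}) = \varnothing \ \text{in } \mathbb{P}^1 .
\]
\[
\widetilde{A}_3 = \operatorname{span}_k\{X_0^3,\ X_0^2X_1,\ X_0X_1^2,\ X_1^3\} \subseteq \mathfrak{a},
\qquad
X_0X_1 \in \widetilde{A}_2 \setminus \mathfrak{a}.
\]
```

Every monomial $X_0^iX_1^j$ with $i + j \ge 3$ has $i \ge 2$ or $j \ge 2$, so $\mathfrak{a} \supseteq \widetilde{A}_k$ for all $k \ge 3$, while $N = 2$ fails because $X_0X_1$ is not in $\mathfrak{a}$.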


A projective variety is any nonempty⁹ projective algebraic set of the form
$V(\mathfrak{p})$, where $\mathfrak{p}$ is a prime homogeneous ideal in $\widetilde{A}$. If the ideal $\mathfrak{p}$ is the principal
ideal generated by an irreducible homogeneous polynomial, then the ideal or the
variety is called an irreducible projective hypersurface.¹⁰

⁹The prime homogeneous ideal $\mathfrak{p} = (X_0, \dots, X_n)$ has $V(\mathfrak{p}) = \varnothing$, but no other prime
homogeneous ideal $\mathfrak{q}$ has $V(\mathfrak{q}) = \varnothing$. In order to avoid trivial counterexamples to some
results, we shall often want to exclude this particular prime ideal $\mathfrak{p}$ from consideration.


Corollary 10.13. The projective varieties in $\mathbb{P}^n$ are characterized as those
nonempty Zariski closed sets that cannot be written as the union of two proper
closed subsets.


REMARK. Such a subset of $\mathbb{P}^n$ is said to be irreducible. As in the affine case,
irreducible sets are understood to be nonempty.


PROOF. If $V(\mathfrak{p})$ is a projective variety, then the union of $\{0\}$ and the subset
of $k^{n+1}$ whose equivalence classes are in $V(\mathfrak{p})$ is an affine variety in $\mathbb{A}^{n+1}$. It is
irreducible in $\mathbb{A}^{n+1}$, and this irreducibility in $\mathbb{A}^{n+1}$ implies irreducibility within $\mathbb{P}^n$.

Conversely if $E$ is an irreducible closed subset of $\mathbb{P}^n$ and if $F$ and $G$ are
homogeneous members of $\widetilde{A}$ with $FG$ in $I(E)$, then we can argue as in the proof
of Corollary 10.4 to see that one of $F$ and $G$ lies in $I(E)$ and that $I(E)$ is proper.
Since $I(E)$ is a homogeneous ideal, this fact implies that $I(E)$ is prime.


Since $\widetilde{A}$ is a Noetherian ring, it follows that $\mathbb{P}^n$ is a Noetherian topological
space in the sense of Section 2. Consequently Proposition 10.5 is applicable.
Combining this result with Corollary 10.13, we obtain the following corollary.


Corollary 10.14. Every projective algebraic set in $\mathbb{P}^n$ can be expressed
uniquely as the finite (possibly empty) union of projective varieties in such a
way that none of the varieties contains another of the varieties.


Geometric dimension is therefore meaningful for nonempty projective algebraic
sets, and each such set in $\mathbb{P}^n$ has geometric dimension $\le n$.

A quasiprojective variety is any nonempty Zariski open subset of a projective 
variety. Quasi-affine varieties and quasiprojective varieties will be the main 
objects of interest geometrically in Sections 1-7. If Y is a quasiprojective variety, 
then the relative Zariski topology on Y makes Y into a Noetherian topological 
space, just as in the quasi-affine case. Consequently Y has a meaningful geometric 
dimension. The arguments in Lemma 10.10 and Proposition 10.11 concerning 
quasi-affine varieties are arguments in point-set topology and valid proofs of facts 
about quasiprojective varieties. Therefore we obtain the following result. 


Proposition 10.15. If $Y$ is a quasiprojective variety in $\mathbb{P}^n$, then the closure $\overline{Y}$
in the Zariski topology of $\mathbb{P}^n$ is a projective variety, and the geometric dimensions
of $Y$ and $\overline{Y}$ are equal.


¹⁰As in the affine case, as long as the assumption of irreducibility is in force, the distinction
between the ideal and the variety is unimportant.
We can identify $\mathbb{A}^n$ as a subset of $\mathbb{P}^n$ by the formula

$$\beta_0(x_1, \dots, x_n) = [1, x_1, \dots, x_n]$$

for $(x_1, \dots, x_n)$ in $\mathbb{A}^n$. The complement of $\beta_0(\mathbb{A}^n)$ in $\mathbb{P}^n$ is the zero locus
of the homogeneous polynomial $X_0$, and consequently $\beta_0(\mathbb{A}^n)$ is open in $\mathbb{P}^n$.
Since the equality $\mathbb{P}^n = V(0)$ exhibits $\mathbb{P}^n$ as a projective variety, $\beta_0(\mathbb{A}^n)$ is a
quasiprojective variety. We are going to show that $\beta_0$ respects topologies in that
the Zariski topology of $\mathbb{A}^n$ is carried to the Zariski topology of the quasiprojective
variety $\beta_0(\mathbb{A}^n)$. To do so, we make use of the corresponding transpose mapping
$\beta_0^t : \widetilde{A} \to A$ on polynomials given by $\beta_0^t F = f$ with

$$f(X_1, \dots, X_n) = F(\beta_0(X_1, \dots, X_n)) = F(1, X_1, \dots, X_n).$$

This is the substitution homomorphism that fixes $k$, fixes $X_1, \dots, X_n$, and carries
$X_0$ to 1. Being an algebra homomorphism onto, $\beta_0^t$ carries ideals of $\widetilde{A}$ to ideals
of $A$. In particular, it carries homogeneous ideals of $\widetilde{A}$ to ideals of $A$.


Lemma 10.16. If $\mathfrak{a}$ is a homogeneous ideal in $\widetilde{A}$ and $\mathfrak{b} = \beta_0^t(\mathfrak{a})$ is its image
under $\beta_0^t$, then $\beta_0^t$ carries the set of homogeneous elements of $\mathfrak{a}$ onto $\mathfrak{b}$.


PROOF. Every member of $\mathfrak{b}$ is the sum of the images under $\beta_0^t$ of finitely many
homogeneous members of $\mathfrak{a}$. If $F_1, \dots, F_r$ are these homogeneous members,
then it is enough to produce $G_1, \dots, G_r$ in $\mathfrak{a}$ all homogeneous of the same degree
such that $\beta_0^t(F_j) = \beta_0^t(G_j)$ for all $j$. If $d_1, \dots, d_r$ are the respective degrees of
$F_1, \dots, F_r$ and if $d = \max(d_1, \dots, d_r)$, then the elements $G_j = X_0^{d - d_j} F_j$ have
the required properties.


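Lemma 10.17. Let $\mathfrak{a}$ be a homogeneous ideal of $\widetilde{A}$, and let $\mathfrak{b}$ be the ideal of $A$
given by $\mathfrak{b} = \beta_0^t(\mathfrak{a})$. Then $\beta_0(V(\mathfrak{b})) = V(\mathfrak{a}) \cap \beta_0(\mathbb{A}^n)$.


PROOF. If $(x_1, \dots, x_n)$ is in $V(\mathfrak{b})$ and if $F$ is a homogeneous member of $\mathfrak{a}$,
then $f = \beta_0^t(F)$ is in $\mathfrak{b}$ with $0 = f(x_1, \dots, x_n) = F(\beta_0(x_1, \dots, x_n))$. Since $F$
is arbitrary, $\beta_0(x_1, \dots, x_n)$ is in $V(\mathfrak{a})$. Thus $\beta_0(V(\mathfrak{b})) \subseteq V(\mathfrak{a}) \cap \beta_0(\mathbb{A}^n)$.

For the reverse inclusion, let $[1, x_1, \dots, x_n]$ be in $V(\mathfrak{a}) \cap \beta_0(\mathbb{A}^n)$. If $f$ is
in $\mathfrak{b}$, find by Lemma 10.16 a homogeneous $F$ in $\mathfrak{a}$ with $\beta_0^t F = f$. Since
$[1, x_1, \dots, x_n]$ is in $V(\mathfrak{a})$, $F(1, x_1, \dots, x_n) = 0$. Therefore $f(x_1, \dots, x_n) =
F(\beta_0(x_1, \dots, x_n)) = F(1, x_1, \dots, x_n) = 0$. Since $f$ is arbitrary in $\mathfrak{b}$, the point
$(x_1, \dots, x_n)$ is in $V(\mathfrak{b})$, and $\beta_0(V(\mathfrak{b})) \supseteq V(\mathfrak{a}) \cap \beta_0(\mathbb{A}^n)$.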


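Proposition 10.18. Under the inclusion $\beta_0 : \mathbb{A}^n \to \mathbb{P}^n$, the Zariski topology
of affine $n$-space $\mathbb{A}^n$ coincides with the relative topology from $\mathbb{P}^n$.

PROOF. If we start from an affine algebraic set $V(\mathfrak{b})$ in $\mathbb{A}^n$, then Lemma 10.17
shows that $\beta_0(V(\mathfrak{b})) = V(\mathfrak{a}) \cap \beta_0(\mathbb{A}^n)$ for the homogeneous ideal $\mathfrak{a}$ in $\widetilde{A}$
generated by the homogeneous members of $(\beta_0^t)^{-1}(\mathfrak{b})$; this ideal has $\beta_0^t(\mathfrak{a}) = \mathfrak{b}$.
Since $V(\mathfrak{a})$ is Zariski closed in $\mathbb{P}^n$, $\beta_0(V(\mathfrak{b}))$ is exhibited as closed in the
relative topology on $\beta_0(\mathbb{A}^n)$.

Conversely suppose that $C$ is closed in the relative topology on $\beta_0(\mathbb{A}^n)$. Then
it is of the form $\overline{C} \cap \beta_0(\mathbb{A}^n)$ for some projective algebraic set $\overline{C}$. The set $\overline{C}$ is of
the form $V(\mathfrak{a})$ for some homogeneous ideal $\mathfrak{a}$. If $\mathfrak{b} = \beta_0^t(\mathfrak{a})$, then Lemma 10.17
shows that

$$\beta_0(V(\mathfrak{b})) = V(\mathfrak{a}) \cap \beta_0(\mathbb{A}^n) = \overline{C} \cap \beta_0(\mathbb{A}^n) = C,$$

and $C$ is exhibited as $\beta_0$ of an affine algebraic set in $\mathbb{A}^n$.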


Corollary 10.19. If $V$ is a quasi-affine variety in $\mathbb{A}^n$, then $\beta_0(V)$ is a
quasiprojective variety in $\mathbb{P}^n$. Moreover, the geometric dimension of $V$ as a quasi-affine
variety equals the geometric dimension of $\beta_0(V)$ as a quasiprojective variety.


REMARKS. In other words, the closure $\overline{\beta_0(V)}$ is a projective variety. It is called
the projective closure of the quasi-affine variety $V$. If $V$ is actually an affine
variety, then it has an associated prime ideal in $A$, and the projective variety $\overline{\beta_0(V)}$
has an associated homogeneous prime ideal in $\widetilde{A}$. The correspondence between
the prime ideal in $A$ and the homogeneous prime ideal in $\widetilde{A}$ will be examined
shortly.


PROOF. Because of the homeomorphism given by Proposition 10.18, Lemma
10.10 as restated in the lemma's remarks applies with $Y = \beta_0(\mathbb{A}^n)$, $X = \mathbb{P}^n$, and
$E$ equal to the closure of $V$ in $\mathbb{A}^n$. The conclusion is that the closure of $E$ in $\mathbb{P}^n$
is a projective variety, and the first conclusion of the corollary is proved. The
second conclusion is immediate from the version of Proposition 10.11 mentioned
in the remarks with that proposition.


To each index $i$ with $0 \le i \le n$, we can associate in a similar way a function
$\beta_i : \mathbb{A}^n \to \mathbb{P}^n$. The formula for $\beta_i$ is $\beta_i(x_1, \dots, x_n) = [y_0, \dots, y_n]$, where
$y_j = x_{j+1}$ for $j < i$, $y_i = 1$, and $y_j = x_j$ for $j > i$. Just as in Proposition 10.18,
under each $\beta_i$, the Zariski topology of affine $n$-space $\mathbb{A}^n$ coincides with the relative
topology from $\mathbb{P}^n$. One consequence is that the notion of projective closure is
meaningful if formed relative to any $\beta_i$ in place of $\beta_0$. Another consequence is that
$\mathbb{P}^n$ has a covering by $n + 1$ open sets $\beta_i(\mathbb{A}^n)$ that are each Zariski homeomorphic
to $\mathbb{A}^n$. The functions $\beta_i$ may be viewed as playing a role similar to the inverses
of charts in the definition of a smooth manifold.
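
The chart maps $\beta_i$ can be made concrete with a minimal sketch. It is only an illustration under stated assumptions: points have rational coordinates (standing in for elements of $k$), a point of $\mathbb{P}^n$ is stored as one representative tuple of homogeneous coordinates, and the helper names `beta`, `beta_inverse`, and `charts_containing` are hypothetical, not notation from the text.

```python
from fractions import Fraction

def beta(i, x):
    """Chart map beta_i: insert the coordinate 1 at position i.

    With x = (x_1, ..., x_n) stored 0-indexed, this gives
    y_j = x_{j+1} for j < i, y_i = 1, and y_j = x_j for j > i.
    """
    x = list(x)
    return tuple(x[:i] + [1] + x[i:])

def beta_inverse(i, y):
    """Inverse of beta_i on the open set where y_i != 0.

    Rescale the homogeneous coordinates so that y_i = 1, then drop y_i.
    """
    if y[i] == 0:
        raise ValueError("point is not in the image of beta_i")
    scaled = [Fraction(c) / Fraction(y[i]) for c in y]
    return tuple(scaled[:i] + scaled[i + 1:])

def charts_containing(y):
    """Indices i for which the point [y_0, ..., y_n] lies in beta_i(A^n)."""
    return [i for i, c in enumerate(y) if c != 0]
```

Since some homogeneous coordinate of a point is always nonzero, `charts_containing` never returns an empty list, which is the covering of $\mathbb{P}^n$ by the $n + 1$ sets $\beta_i(\mathbb{A}^n)$.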


Having used $\beta_0$ to associate a projective variety in $\mathbb{P}^n$ to each affine variety in
$\mathbb{A}^n$ by passage to the topological closure, we turn to what happens with ideals.
Distinct homogeneous ideals in $\widetilde{A}$ can map under $\beta_0^t$ to the same ideal in $A$; for
example the principal ideals $(1)$ and $(X_0)$ in $\widetilde{A}$ both map to $(1)$ in $A$. Theorem
10.20 will show that we can associate a particularly nice ideal of $\widetilde{A}$ to each ideal
of $A$ in such a way that prime ideals of $A$ correspond to those nice ideals of $\widetilde{A}$
that are prime. Under this correspondence the ideals for an affine variety and its
projective closure will match. It will be apparent from the construction in the
proof that the ideal of $\widetilde{A}$ is generated by all homogeneous polynomials $F = F(f)$
of the form

$$F(X_0, \dots, X_n) = X_0^d f(X_1/X_0, \dots, X_n/X_0)$$

whenever $f \ne 0$ is in the ideal of $A$ and $\deg f = d$.


Theorem 10.20. As a mapping of ideals in $\widetilde{A}$ to ideals in $A$, $\beta_0^t$ is one-one
from the set $\widetilde{\mathcal{I}}$ of all homogeneous ideals $\mathfrak{a}$ of $\widetilde{A}$ such that $X_0F \in \mathfrak{a}$ implies $F \in \mathfrak{a}$
onto the set $\mathcal{I}$ of all ideals of $A$. Under this one-one correspondence prime ideals
correspond to prime ideals.


PROOF. We are going to construct a two-sided inverse to the mapping induced
by $\beta_0^t$ from ideals in $\widetilde{\mathcal{I}}$ to ideals in $\mathcal{I}$.

Let $A_{\le d}$ be the $k$ vector space of all members of $A$, including the 0 polynomial,
of degree $\le d$. The homomorphism $\beta_0^t$ carries $\widetilde{A}_d$ linearly into $A_{\le d}$, and it carries
the basis of homogeneous monomials in $\widetilde{A}$ of total degree $d$ onto the basis of all
monomials in $A$ of total degree $\le d$. Thus $\beta_0^t : \widetilde{A}_d \to A_{\le d}$ is one-one onto.
Observe for any $f$ in $A_{\le d}$ that the formula

$$F(X_0, \dots, X_n) = X_0^d f(X_1/X_0, \dots, X_n/X_0)$$

defines a member of $\widetilde{A}_d$. If we write $F = \varphi_d(f)$ when $f$ and $F$ are related in this
way, then the function $\varphi_d$ is a one-one $k$ linear map from $A_{\le d}$ into $\widetilde{A}_d$ such that
$\varphi_d \beta_0^t$ is the identity on $\widetilde{A}_d$. Because of finite dimensionality, $\beta_0^t : \widetilde{A}_d \to A_{\le d}$ and
$\varphi_d : A_{\le d} \to \widetilde{A}_d$ are two-sided inverses of one another.
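
The two maps in this paragraph can be sketched in a few lines. This is an illustration only, under the assumption that a polynomial is stored as a dictionary from exponent tuples to coefficients; the function names `homogenize` and `dehomogenize` are hypothetical labels for $\varphi_d$ and $\beta_0^t$ and do not come from the text.

```python
def homogenize(f, d):
    """phi_d: send f in A_{<=d} to X_0^d f(X_1/X_0, ..., X_n/X_0).

    f maps exponent tuples (e_1, ..., e_n) to coefficients; the result
    maps exponent tuples (e_0, e_1, ..., e_n) of total degree d.
    """
    F = {}
    for exps, coeff in f.items():
        deg = sum(exps)
        if deg > d:
            raise ValueError("f must have degree <= d")
        # Pad each monomial with the power of X_0 that brings it to degree d.
        F[(d - deg,) + tuple(exps)] = coeff
    return F

def dehomogenize(F):
    """beta_0^t: substitute X_0 = 1, i.e. drop the exponent of X_0."""
    f = {}
    for exps, coeff in F.items():
        key = tuple(exps[1:])
        f[key] = f.get(key, 0) + coeff
    # Discard monomials whose coefficients cancelled.
    return {e: c for e, c in f.items() if c != 0}

# Example: f = Y - X^2 in k[X, Y], with exponent order (X, Y).
parabola = {(0, 1): 1, (2, 0): -1}
parabola_h = homogenize(parabola, 2)   # X_0*Y - X^2
```

On a fixed degree the two functions undo each other, matching the statement that $\beta_0^t$ and $\varphi_d$ are two-sided inverses between $\widetilde{A}_d$ and $A_{\le d}$. Across different degrees `dehomogenize` is not one-one: the constants $1$ and the monomial $X_0$ have the same image.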

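Suppose that an ideal $\mathfrak{b}$ in $A$ is given. Define $\mathfrak{a}_d = \varphi_d(\mathfrak{b} \cap A_{\le d})$, and put
$\mathfrak{a} = \bigoplus_{d=0}^{\infty} \mathfrak{a}_d$. According to remarks in the paragraph with the definition of
homogeneous ideal, $\mathfrak{a}$ is a homogeneous ideal if $G\mathfrak{a}_d \subseteq \mathfrak{a}_{d+e}$ whenever $G$ is in
$\widetilde{A}_e$. Define $g = \beta_0^t(G)$. This polynomial has $\deg g \le e$ and $\varphi_e(g) = G$, since
$\varphi_e : A_{\le e} \to \widetilde{A}_e$ is a two-sided inverse of $\beta_0^t : \widetilde{A}_e \to A_{\le e}$. If $f$ is in $\mathfrak{b} \cap A_{\le d}$, then
$gf$ is in $\mathfrak{b} \cap A_{\le d+e}$, and thus $G\varphi_d(f) = \varphi_e(g)\varphi_d(f) = \varphi_{d+e}(gf)$ is in $\mathfrak{a}_{d+e}$.
This proves that $\mathfrak{a}$ is a homogeneous ideal in $\widetilde{A}$.

Under the construction $\mathfrak{b} \mapsto \mathfrak{a}$, let us see that $\mathfrak{a}$ is in $\widetilde{\mathcal{I}}$. If $X_0F$ is in
$\mathfrak{a}_{d+1}$, then we can write $X_0F = \varphi_{d+1}(g)$ for some $g$ in $\mathfrak{b} \cap A_{\le d+1}$. That
is, $X_0F(X_0, \dots, X_n) = X_0^{d+1} g(X_1/X_0, \dots, X_n/X_0)$. Then $F(X_0, \dots, X_n) =
X_0^d g(X_1/X_0, \dots, X_n/X_0)$. This formula shows that $g$ is in $A_{\le d}$ and that $F =
\varphi_d(g)$. Hence $F$ is in $\mathfrak{a}_d$. In other words, the construction $\mathfrak{b} \mapsto \mathfrak{a}$ carries members
of $\mathcal{I}$ to members of $\widetilde{\mathcal{I}}$.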
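Under the construction $\mathfrak{b} \mapsto \mathfrak{a}$, the homogeneous ideal $\mathfrak{a}$ has the property that

$$\beta_0^t(\mathfrak{a}) = \beta_0^t\Bigl(\bigoplus_{d=0}^{\infty} \mathfrak{a}_d\Bigr) = \sum_{d=0}^{\infty} \beta_0^t(\mathfrak{a}_d) = \sum_{d=0}^{\infty} (\mathfrak{b} \cap A_{\le d}) = \mathfrak{b}.$$

Thus our construction starting from an ideal of $A$, passing to an ideal in the set
$\widetilde{\mathcal{I}}$, and passing back to an ideal of $A$ recovers the original ideal of $A$.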

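Now suppose that $\mathfrak{a}$ is in $\widetilde{\mathcal{I}}$. Put $\mathfrak{b} = \beta_0^t(\mathfrak{a})$. To see that the above passage to a
member of $\widetilde{\mathcal{I}}$ recovers $\mathfrak{a}$ from $\mathfrak{b}$, we are to show that

$$\mathfrak{a} \cap \widetilde{A}_d = \varphi_d(\mathfrak{b} \cap A_{\le d}). \qquad (*)$$

First we establish that

$$\beta_0^t(\mathfrak{a} \cap \widetilde{A}_d) = \beta_0^t(\mathfrak{a}) \cap A_{\le d}. \qquad (**)$$

The inclusion $\subseteq$ in $(**)$ is easy because $\beta_0^t(\mathfrak{a} \cap \widetilde{A}_d) \subseteq \beta_0^t(\mathfrak{a})$ and $\beta_0^t(\widetilde{A}_d) \subseteq A_{\le d}$.
For the reverse inclusion, let $f$ be in $\beta_0^t(\mathfrak{a} \cap \widetilde{A}_k) \cap A_{\le d}$ for some $k$. This means
that $\deg f \le d$ and that $f = \beta_0^t(G)$ with $G \in \mathfrak{a} \cap \widetilde{A}_k$. Without loss of generality,
we may assume that $k \ge d$. Let $F$ be the element $F = \varphi_{\deg f}(f)$ of $\widetilde{A}_{\deg f}$.
Then $X_0^{k - \deg f} F = \varphi_k(f)$, and $\beta_0^t(X_0^{k - \deg f} F) = \beta_0^t\varphi_k(f) = f = \beta_0^t(G)$. Hence
$X_0^{k - \deg f} F$ and $G$ are members of $\widetilde{A}_k$ with the same value under $\beta_0^t$. Since $\beta_0^t$ is
one-one on $\widetilde{A}_k$, $G = X_0^{k - \deg f} F$. Since $G$ is in $\mathfrak{a}$ and since the ideal $\mathfrak{a}$ is in $\widetilde{\mathcal{I}}$, $F$ is
in $\mathfrak{a}$. Hence the element $X_0^{d - \deg f} F$ is in $\mathfrak{a} \cap \widetilde{A}_d$, and it has $\beta_0^t(X_0^{d - \deg f} F) = f$.
This proves the inclusion $\supseteq$ in $(**)$. Application of $\varphi_d$ to both sides of $(**)$
proves $(*)$ and completes the proof of the first statement of the theorem.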

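We are to show that prime ideals correspond to prime ideals. Let $\mathfrak{b}$ in $\mathcal{I}$ be
prime, and let $\mathfrak{a}$ be the ideal in $\widetilde{\mathcal{I}}$ with $\beta_0^t(\mathfrak{a}) = \mathfrak{b}$. Let $F$ and $G$ be homogeneous
elements in $\widetilde{A}$ of respective degrees $d$ and $e$ with $FG$ in $\mathfrak{a}$. Then $fg$ lies in $\mathfrak{b}$,
where $f = \beta_0^t(F)$ and $g = \beta_0^t(G)$, and one of $f$ and $g$ lies in $\mathfrak{b}$ because $\mathfrak{b}$ is
prime. Say $f$ is in $\mathfrak{b}$. Then $F = \varphi_d(f)$ lies in the right side of $(*)$ and hence lies
in the left side. Consequently $F$ is in $\mathfrak{a}$, and $\mathfrak{a}$ is prime.

Conversely let $\mathfrak{a}$ in $\widetilde{\mathcal{I}}$ be prime, and let $\mathfrak{b} = \beta_0^t(\mathfrak{a})$. Suppose that $f$ and $g$ are
members of $A$ with $fg$ in $\mathfrak{b}$. Put $d = \deg f$ and $e = \deg g$, and define $F = \varphi_d(f)$
and $G = \varphi_e(g)$. Then $FG = \varphi_{d+e}(fg)$ is in $\varphi_{d+e}(\mathfrak{b} \cap A_{\le d+e})$, and $(*)$ shows
that $FG$ is in $\mathfrak{a} \cap \widetilde{A}_{d+e}$. Since $\mathfrak{a}$ is prime, one of $F$ and $G$ is in $\mathfrak{a}$. Say that $F$ is
in $\mathfrak{a}$. Then $f = \beta_0^t(F)$ is in $\mathfrak{b}$, and $\mathfrak{b}$ is prime.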
Corollary 10.21. The inclusion $\beta_0 : \mathbb{A}^n \to \mathbb{P}^n$ sets up a one-one correspondence
between the prime ideals in $A$ and those prime homogeneous ideals in $\widetilde{A}$
that do not contain $X_0$.


PROOF. If $\mathfrak{a}$ is a prime homogeneous ideal in $\widetilde{A}$ and $X_0F$ is in $\mathfrak{a}$, then either
$X_0$ or $F$ is in $\mathfrak{a}$. If we can always exclude $X_0$ from being in $\mathfrak{a}$, then $F$ is in $\mathfrak{a}$, and
the condition in the theorem for $\mathfrak{a}$ to be in $\widetilde{\mathcal{I}}$ is satisfied. The rest follows from
Theorem 10.20.


Corollary 10.22. Let $\mathfrak{a}$ be a prime homogeneous ideal of $\widetilde{A}$ not containing
$X_0$, and let $\mathfrak{b} = \beta_0^t(\mathfrak{a})$ be the corresponding prime ideal of $A$. Then the Zariski
closure in $\mathbb{P}^n$ of $\beta_0(V(\mathfrak{b}))$ is $V(\mathfrak{a})$.


REMARKS. In other words, if an affine variety $V$ has $\mathfrak{b}$ as its ideal in $A$, then
the projective closure of $V$ has the corresponding $\mathfrak{a}$ from Theorem 10.20 as its
ideal in $\widetilde{A}$.


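PROOF. Corollary 10.19 shows that $\overline{\beta_0(V(\mathfrak{b}))} = V(\mathfrak{a}')$ for some prime homogeneous
ideal $\mathfrak{a}'$ of $\widetilde{A}$. Since $\beta_0(V(\mathfrak{b})) \subseteq V(\mathfrak{a})$ by Lemma 10.17 and since $V(\mathfrak{a})$ is
closed in $\mathbb{P}^n$, $V(\mathfrak{a}') \subseteq V(\mathfrak{a})$. Arguing by contradiction, suppose that the inclusion
is strict. Applying $I(\,\cdot\,)$ and using Proposition 10.12b, we obtain $\mathfrak{a}' \supseteq \mathfrak{a}$. Since
application of $V(\,\cdot\,)$ to both sides of $\mathfrak{a}' \supseteq \mathfrak{a}$ has to yield a strict inclusion, we must
have $\mathfrak{a}' \ne \mathfrak{a}$. Choose $G$ homogeneous in $\mathfrak{a}'$ that is not in $\mathfrak{a}$, and put $f = \beta_0^t G$. If
$(x_1, \dots, x_n)$ is in $V(\mathfrak{b})$, then $[1, x_1, \dots, x_n]$ is in $\beta_0(V(\mathfrak{b})) \subseteq V(\mathfrak{a}')$, and hence
$f(x_1, \dots, x_n) = G(1, x_1, \dots, x_n) = 0$. Thus $f$ is in $I(V(\mathfrak{b})) = \mathfrak{b}$. Since
$\deg f \le \deg G$, the construction of $\mathfrak{a}$ from $\mathfrak{b}$ in the proof of Theorem 10.20
shows that $F = \varphi_{\deg G}(f)$ is in $\mathfrak{a}$. Then $G$ and $F$ are members of $\widetilde{A}_{\deg G}$ with
$\beta_0^t(G) = f = \beta_0^t(F)$, and we obtain $G = F$, contradiction.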


EXAMPLE. Twisted cubic from the example in Section 1 and Example 2 in
Section 2. The prime ideal $\mathfrak{b} \subseteq k[X, Y, Z]$ is $(Y - X^2, Z - X^3)$, and we want
to find the corresponding ideal $\mathfrak{a}$ given by Corollary 10.21. Let the additional
indeterminate in $\widetilde{A}$ be $W$. Applying $\varphi_2$ and $\varphi_3$ to the respective generators $Y - X^2$
and $Z - X^3$ yields $WY - X^2$ and $W^2Z - X^3$. These must be in $\mathfrak{a}$. So must

$$(W^2Z - X^3) - X(WY - X^2) = W(WZ - XY)$$

and

$$X(W^2Z - X^3) - (WY + X^2)(WY - X^2) = W^2(XZ - Y^2).$$

Since we seek a prime ideal for $\mathfrak{a}$ and $W$ is not to be in $\mathfrak{a}$, $WZ - XY$ and $XZ - Y^2$
are in $\mathfrak{a}$. Thus $\mathfrak{a} \supseteq (WY - X^2, WZ - XY, XZ - Y^2)$. If $\mathfrak{c}$ denotes the ideal on
the right, then $\mathfrak{a} \supseteq \mathfrak{c}$ and

$$\beta_0^t(\mathfrak{c}) = (Y - X^2,\ Z - XY,\ XZ - Y^2) = (Y - X^2,\ Z - X^3,\ XZ - X^4) = (Y - X^2,\ Z - X^3) = \beta_0^t(\mathfrak{a}).$$
To show that $\mathfrak{a} = \mathfrak{c}$, it is enough according to Theorem 10.20 to show that if $F$
is homogeneous and $WF$ is in $\mathfrak{c}$, then $F$ is in $\mathfrak{c}$. The three generators of $\mathfrak{c}$ are all
in $\widetilde{A}_2$, and thus $\mathfrak{c} \cap \widetilde{A}_d = \widetilde{A}_{d-2}(\mathfrak{c} \cap \widetilde{A}_2)$. Hence it is enough to show that $\mathfrak{c} \cap \widetilde{A}_2$
contains no nonzero element divisible by $W$. Since $\mathfrak{c} \cap \widetilde{A}_2$ consists of all linear
combinations of the three generators, we can check this fact by inspection. The
result is that $\mathfrak{a} = \mathfrak{c}$. Once we know $\mathfrak{a}$, we can compute the projective closure of the
twisted cubic from Corollary 10.22. We find that it consists of all $[w, x, y, z]$ of
the form $[1, x, x^2, x^3]$ together with $[0, 0, 0, 1]$. We might have guessed this form
for the projective closure from the parametric realization of the twisted cubic in $\mathbb{A}^3$
and from a passage to the limit, but proceeding in that fashion requires operations
that we have certainly not justified.
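
The description of the projective closure can be checked directly from the three generators $WY - X^2$, $WZ - XY$, $XZ - Y^2$ by separating the points with $w \ne 0$ (normalized to $w = 1$) from those with $w = 0$. The following short computation is a sketch of that check, not a quotation from the text.

```latex
\begin{align*}
W = 1:&\quad Y - X^2 = 0,\ \ Z - XY = 0,\ \ XZ - Y^2 = 0
   \iff (Y, Z) = (X^2, X^3),\\
W = 0:&\quad -X^2 = 0,\ \ -XY = 0,\ \ XZ - Y^2 = 0
   \iff X = 0,\ Y = 0,\ Z \text{ arbitrary}.
\end{align*}
```

In the first case the third equation is automatic, since $X \cdot X^3 - (X^2)^2 = 0$, and the points are $[1, x, x^2, x^3]$. In the second case the coordinates cannot all vanish, so $z \ne 0$ and the only point is $[0, 0, 0, 1]$.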
We continue to assume that $k$ is an algebraically closed field and to write $A$ for
$k[X_1, \dots, X_n]$ and $\widetilde{A}$ for $k[X_0, \dots, X_n]$. In this section we investigate certain
classes of k-valued functions on quasiprojective varieties, specifically the “ra- 
tional” functions, the “regular” functions, and the local ring of functions regular 
at a particular point. For each kind of variety that we have introduced (affine, 
quasi-affine, projective, and quasiprojective), there are simple global definitions 
and there are complicated but equivalent local definitions for these notions. The 
complicated definitions have three advantages over the simple ones: they are 
virtually the same for all four kinds of varieties and therefore make it possible 
to work with all kinds of varieties uniformly, they make it possible in practice to 
construct a function by constructing only a local part of it, and they prepare the 
way better for a definition of isomorphism of varieties that does not insist on a 
particular dimension for the ambient affine or projective space. 

In this section we shall first give the simple definitions in the affine and 
quasi-affine cases and then prove results saying that certain more complicated 
local-sounding versions of these definitions amount to the same thing as the 
simple definitions. Then we shall give the simple definitions in the projective and 
quasiprojective cases. Finally we shall relate the quasi-affine and quasiprojective 
cases and show that certain more complicated local-sounding definitions in the 
quasiprojective case amount to the same thing as the simple definitions. 

We begin with affine varieties. Suppose that $V = V(\mathfrak{p})$ is an affine variety in
$\mathbb{A}^n$, $\mathfrak{p}$ being a prime ideal in $A$. The affine coordinate ring of $V$ is $A(V) = A/\mathfrak{p}$,
which is an integral domain. Let us write the quotient homomorphism $A \to A(V)$
as $a \mapsto \bar{a}$. Because of the Nullstellensatz, $A(V)$ can be identified with the ring of
all restrictions of polynomials to $V$; in particular, $\bar{a}(P)$ is meaningful for every
$\bar{a} \in A(V)$ and $P \in V$.
Proposition 10.23. If $V$ is an affine variety in $\mathbb{A}^n$, then the points $P$ of $V$ are
in one-one correspondence with the maximal ideals $\mathfrak{m}_P$ of the affine coordinate
ring $A(V)$, the correspondence being that $\mathfrak{m}_P$ is the maximal ideal of all members
$\bar{a}$ of $A(V)$ with $\bar{a}(P) = 0$.


PROOF. Each $\mathfrak{m}_P$ is a maximal ideal, being the kernel of a multiplicative
linear functional. In the reverse direction, if $\mathfrak{m}$ is a maximal ideal of $A(V)$,
then its inverse image in $A$ under the homomorphism $A \to A/\mathfrak{p} = A(V)$ is a
maximal ideal $M$ of $A$ containing $\mathfrak{p}$, by the First Isomorphism Theorem. The
Nullstellensatz shows that $M$ consists of all polynomials vanishing at some point
$P$. Applying $V(\,\cdot\,)$ to the inclusion $M \supseteq \mathfrak{p}$ gives $\{P\} = V(M) \subseteq V(\mathfrak{p}) = V$.
Thus $P$ is in $V$.


Members of the field of fractions $k(V)$ of $A(V)$ are called rational functions
on $V$, and $k(V)$ is called the function field on $V$. Rational functions on $V$ are
not really functions on $V$ in the traditional sense, since their denominators can
vanish here and there. By way of compensation, an allowable denominator never
vanishes identically; the reason is that the construction of a field of fractions
of an integral domain does not involve using the zero element of the integral
domain in a denominator. If $f$ is a rational function on $V$ and $P$ is in $V$, one
says that $f$ is regular at $P$, or defined at $P$, if there exist $\bar{a}$ and $\bar{b}$ in $A(V)$
with $\bar{b}(P) \ne 0$ such that $f = \bar{a}/\bar{b}$. In this case, an equality $\bar{a}/\bar{b} = \bar{a}'/\bar{b}'$
with $\bar{b}(P) \ne 0$ and $\bar{b}'(P) \ne 0$ implies that $\bar{a}\bar{b}' = \bar{a}'\bar{b}$, from which we see that
$\bar{a}(P)\bar{b}'(P) = \bar{a}'(P)\bar{b}(P)$ and that $\bar{a}(P)/\bar{b}(P) = \bar{a}'(P)/\bar{b}'(P)$. Hence $f(P)$ can
be defined unambiguously as $f(P) = \bar{a}(P)/\bar{b}(P)$. For $P$ in $V$, the set of rational
functions on $V$ that are regular at $P$ is a $k$ algebra, as we see by carrying out the
usual manipulations to add or multiply fractions. This $k$ algebra is denoted by
$\mathcal{O}_P(V)$. It has $A(V) \subseteq \mathcal{O}_P(V) \subseteq k(V)$.

As in Proposition 10.23, let $\mathfrak{m}_P$ be the maximal ideal of all members $\bar{a}$ of $A(V)$
with $\bar{a}(P) = 0$. The localization of $A(V)$ with respect to this maximal ideal is
exactly $\mathcal{O}_P(V)$. In fact, the localization is a subring of $k(V)$ because $A(V)$ is an
integral domain. The members of $\mathcal{O}_P(V)$ are exactly the quotients $f = \bar{a}/\bar{b}$ with
$\bar{a}$ and $\bar{b}$ in $A(V)$ and with $\bar{b}$ not in $\mathfrak{m}_P$. Hence $\mathcal{O}_P(V) = S^{-1}A(V)$, where $S$ is
the set-theoretic complement of $\mathfrak{m}_P$. Thus $\mathcal{O}_P(V)$ is the asserted localization. It
has a unique maximal ideal and is called the local ring of $V$ at $P$.
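
For orientation, the simplest instance of this localization can be written out explicitly. The example below is an illustration that is not in the text: it assumes $n = 1$, $\mathfrak{p} = 0$, so that $V = \mathbb{A}^1$ and $A(V) = k[X]$, and it takes $P$ to be the origin.

```latex
\[
\mathfrak{m}_P = (X), \qquad
\mathcal{O}_P(\mathbb{A}^1) = k[X]_{(X)}
  = \Bigl\{ \tfrac{a}{b} \in k(X) \ \Bigm|\ a, b \in k[X],\ b(0) \neq 0 \Bigr\},
\]
\[
\frac{1}{1 + X} \in \mathcal{O}_P(\mathbb{A}^1), \qquad
\frac{1}{X} \notin \mathcal{O}_P(\mathbb{A}^1), \qquad
\text{maximal ideal of } \mathcal{O}_P(\mathbb{A}^1) = X \cdot \mathcal{O}_P(\mathbb{A}^1).
\]
```

A member of $\mathcal{O}_P(\mathbb{A}^1)$ is a unit exactly when its value at the origin is nonzero, which is why the nonunits form the single maximal ideal shown.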

A rational function is said to be regular on an open subset U of V if it is 
regular at every point of U. The regular functions on U form a k algebra denoted 
by $\mathcal{O}(U)$. In symbols the definition of $\mathcal{O}(U)$ is $\mathcal{O}(U) = \bigcap_{P \in U} \mathcal{O}_P(V)$.

When A(V) is a unique factorization domain, the definition of regular at a 
point is simple enough to implement globally: we write f = a/b in some 
fashion, reduce the fraction to lowest terms, and then read off all the points P 
for which f is defined from the single expression of f as a quotient. Ordinarily, 
however, $A(V)$ is not a unique factorization domain, and then the definition is
more subtle, as the following example shows. 


EXAMPLE. $V = V(\mathfrak{p})$ with $\mathfrak{p} = (XW - YZ)$ and $n = 4$. The polynomial
$XW - YZ$ is irreducible, and thus $V$ is an affine variety in $\mathbb{A}^4$. The affine
coordinate ring is $A(V) = k[W, X, Y, Z]/(XW - YZ)$. The quotient $f = \bar{X}/\bar{Y}$
is a rational function on $V$, since $\bar{Y}$ is not the 0 element of $A(V)$, and the definition
shows that $f$ is regular at all points $(w, x, y, z)$ of $V$ having $y \ne 0$. From
$\bar{X}\bar{W} - \bar{Y}\bar{Z} = 0$, we have $\bar{X}/\bar{Y} = \bar{Z}/\bar{W}$, and thus $f$ is defined also at all points
$(w, x, y, z)$ of $V$ having $w \ne 0$. For example it is defined at the additional point
$(w, x, y, z) = (1, 0, 0, 0)$. Actually, there exist no members $\bar{a}$ and $\bar{b}$ of $A(V)$ with
$f = \bar{a}/\bar{b}$ and $\bar{b}(w, x, y, z) \ne 0$ whenever $xw = yz$ and one or both of $w$ and $y$
are nonzero. The details are carried out in Problem 8 at the end of the chapter.
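
The way the two expressions for $f$ patch together can be sketched numerically. This is an illustration under stated assumptions: points have integer coordinates (standing in for elements of $k$), and the helper names `on_V` and `eval_f` are hypothetical. The function only tries the two quotients named in the example; a return value of `None` means that neither of those two expressions applies at the point.

```python
from fractions import Fraction

def on_V(w, x, y, z):
    """Membership in V = V(XW - YZ)."""
    return x * w == y * z

def eval_f(w, x, y, z):
    """Evaluate f = X/Y = Z/W at a point of V, using whichever of the
    two quotients has a nonvanishing denominator there."""
    if not on_V(w, x, y, z):
        raise ValueError("point is not on V")
    if y != 0:
        return Fraction(x, y)   # the expression X/Y
    if w != 0:
        return Fraction(z, w)   # the expression Z/W
    return None                 # w = y = 0: neither expression applies
```

Where both denominators are nonzero the two quotients agree, because $xw = yz$ on $V$; for instance at $(w, x, y, z) = (2, 3, 6, 1)$ both give $1/2$.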


The set of points $P$ in the affine variety $V$ at which a rational function $f$ on $V$
fails to be regular is called the pole set of $f$.


Proposition 10.24. If $f$ is a rational function on the affine variety $V = V(\mathfrak{p})$,
then the pole set of $f$ is the affine algebraic set $V(\mathfrak{a}) \subseteq V(\mathfrak{p})$ corresponding to
the ideal $\mathfrak{a} \supseteq \mathfrak{p}$ of all $b \in A$ such that $\bar{b}f$ is in $A(V)$.


PROOF. The set $\mathfrak{a}$ in the statement is an ideal in $A$ that contains $\mathfrak{p}$. Hence
$V(\mathfrak{a}) \subseteq V(\mathfrak{p})$. If $P$ is in $V(\mathfrak{p})$ and $f$ is defined at $P$, then there are members $\bar{a}$
and $\bar{b}$ of $A(V)$ with $\bar{b}(P) \ne 0$ such that $\bar{b}f = \bar{a}$; any representative of this $\bar{b}$ in $A$
lies in $\mathfrak{a}$, and consequently $P$ is not in $V(\mathfrak{a})$. Conversely if $f$ is not defined at $P$,
then no $\bar{b}$ such that $\bar{b}f$ is in $A(V)$ has $\bar{b}(P) \ne 0$. That is, no member $b$ of $\mathfrak{a}$ has
$b(P) \ne 0$. So $P$ is in $V(\mathfrak{a})$. This proves that the pole set of $f$ is exactly $V(\mathfrak{a})$.


Corollary 10.25. If $V = V(\mathfrak{p})$ is an affine variety, then

$$A(V) = \bigcap_{P \in V} \mathcal{O}_P(V).$$

REMARKS. In the notation introduced above, the corollary says that $A(V) =
\mathcal{O}(V)$.


PROOF. The inclusion $\subseteq$ follows from the fact that $A(V) \subseteq \mathcal{O}_P(V)$ for each
$P$. For the reverse inclusion, suppose that $f$ lies in $\bigcap_{P \in V} \mathcal{O}_P(V)$. Then the
pole set of $f$ in $V$ is empty. The pole set for $f$ is the set $V(\mathfrak{a})$ for the ideal $\mathfrak{a}$ in
Proposition 10.24, and it follows from the Nullstellensatz that $\mathfrak{a} = A$. Then 1 is
in $\mathfrak{a}$, and the definition of $\mathfrak{a}$ shows that $f$ is in $A(V)$.


If we consider the complement of the pole set of f, then we see from Propo- 
sition 10.24 that the subset of V at which f is regular is (relatively) open in V. 
Hence it is empty or dense in V. On the set where f is regular, f is continuous 
into $\mathbb{A}^1$, according to the following proposition.
Proposition 10.26. If a rational function $f$ on the affine variety $V$ is regular
on the nonempty open set $U$ of $V$, then it is continuous from $U$ into $\mathbb{A}^1$ with the
Zariski topology (in which the proper closed sets are the finite sets).


PROOF. It is to be proved that $f^{-1}$ of any finite subset of $\mathbb{A}^1$ is relatively closed
in $U$. Since the finite union of closed sets is closed, it is enough to consider
$f^{-1}(\{c\})$ for an element $c$ of $k$. This is the intersection with $U$ of the pole set of
$1/(f - c)$, which is relatively closed in $U$ by Proposition 10.24.


Now we can give the simple definitions in the quasi-affine case. Let the quasi- 
affine variety U in A” have closure the affine variety V. If f is a rational function 
on V, then Proposition 10.24 shows that f is regular on a nonempty open subset 
of V. Since the intersection of any two nonempty open subsets is nonempty, f 
is regular on a nonempty open subset of U. Therefore it is meaningful to view f 
as a rational function on U. We define the function field of rational functions 
on U to be the same as the function field of V: k(U) = k(V). The definition of 
regular function at P is the same for the quasi-affine variety U as for its Zariski 
closure V, and thus the local ring of U at P is given by Op(U) = Op(V). A 
rational function is said to be regular on the quasi-affine variety U if it is regular 
at every point of U. Since k(U) = k(V), the set of regular functions on U is the 
$k$ algebra $\mathcal{O}(U) = \bigcap_{P \in U} \mathcal{O}_P(U)$.

The next step is to prove results saying that certain more complicated local- 
sounding definitions of the above notions amount to the same thing. 


Lemma 10.27. If V is an affine variety, then any two members of the affine 
coordinate ring A(V ) that are equal on a nonempty open subset of V are the same. 


PROOF. Subtracting, we may suppose that $\bar{a} \in A(V)$ is 0 on the nonempty
open subset $U$ of $V$. By Proposition 10.26, $\bar{a}$ is continuous from $V$ into $\mathbb{A}^1$. The
complement of $\bar{a}^{-1}(\{0\})$ has to be open in $V$ and disjoint from $U$, and therefore
it is empty. So $\bar{a}$ is everywhere 0 and is the 0 element of $A(V)$.


Proposition 10.28. Let $U$ be a nonempty open subset of the affine variety $V$
in $\mathbb{A}^n$. Suppose that $f_0 : U \to k$ is a function with the following property: for
each $P$ in $U$, there exist an open subset $W$ of $U$ containing $P$ and polynomials
$a$ and $b$ in $A$ such that $b$ is nowhere vanishing on $W$ and $f_0 = a/b$ on $W$. Then
there exists one and only one member $f$ of $k(V)$ such that $f$ is regular on $U$ and
agrees with $f_0$ at every point of $U$.


REMARKS. For the quasi-affine case the more complicated local-sounding
definition of "regular function" on $U$, mentioned in the first paragraph of this
section, is what is assumed of $f_0$ in the statement of this proposition. The
proposition says that such an $f_0$ necessarily comes from a global rational function
on $V$ that is regular on $U$ in the sense just above.
PROOF OF UNIQUENESS. If there are two such members of $k(V)$, then subtracting
them gives a member $g$ of $k(V)$ that is 0 on $U$. By definition of $k(V)$,
$g = \bar{a}/\bar{b}$ with $\bar{a}$ and $\bar{b}$ in $A(V)$ and with $\bar{b} \ne 0$. Then $\bar{a} = g\bar{b}$ is a member of
$A(V)$ that is 0 on $U$. By Lemma 10.27, $\bar{a} = 0$ in $A(V)$. Thus $g\bar{b} = 0$ in $k(V)$.
Since $k(V)$ is a field and $\bar{b} \ne 0$, $g = 0$.


PROOF OF EXISTENCE. If $P$ is in $U$, then the hypothesis supplies some open
subset $W$ of $U$ containing $P$ and members $a$ and $b$ of $A$ with $b$ nowhere 0 on $W$
and with $f_0 = a/b$ on $W$. Let $\bar{a}$ and $\bar{b}$ be the images of $a$ and $b$ in $A(V)$. Since $b$
is not identically 0 on $U$, $\bar{b}$ is not the 0 element of $A(V)$. Therefore $f = \bar{a}/\bar{b}$ is
a well-defined member of $k(V)$, and it is regular on $W$ and agrees with $f_0$ there.
If we start with another point $P'$ and an open subset $W'$ of $U$ containing $P'$, then
we similarly obtain $f' = \bar{a}'/\bar{b}'$ in $k(V)$ that is regular on $W'$ and agrees with
$f_0$ there. The open subset $W \cap W'$ is nonempty, and $a/b = a'/b'$ on $W \cap W'$.
Therefore $b'a = ba'$ on $W \cap W'$. By Lemma 10.27, $\bar{b}'\bar{a} = \bar{b}\bar{a}'$ as members of
$A(V)$. Dividing, we obtain $f = f'$. Since the member $f$ of $k(V)$ is regular on
an open neighborhood of each point of $U$, it is regular on $U$.


Proposition 10.28 allows us also to give a local-sounding definition of rational
function and see that it reduces to the original definition. Specifically we consider
pairs $(U_0, f_0)$ with $U_0$ nonempty open in the quasi-affine variety $U$ and with $f_0$
satisfying the regularity condition on $U_0$ in the proposition.¹¹ Say that the pair
$(U_0, f_0)$ is equivalent to the pair $(U_1, f_1)$ if $f_0 = f_1$ on $U_0 \cap U_1$. This relation is
reflexive and symmetric. Let us see from the proposition why it is transitive. If
$(U_0, f_0)$ is equivalent to $(U_1, f_1)$, then the existence part of the proposition yields
three members of $k(V)$: one for $(U_0, f_0)$, one for $(U_0 \cap U_1, f_0) = (U_0 \cap U_1, f_1)$,
and one for $(U_1, f_1)$. The uniqueness part shows that the first two members of
$k(V)$ are equal and the last two are equal. Hence they are all equal. Now if
$(U_0, f_0)$ is equivalent to $(U_1, f_1)$ and $(U_1, f_1)$ is equivalent to $(U_2, f_2)$, then we
routinely find that $(U_0 \cap U_1, f_0)$ is equivalent to $(U_1 \cap U_2, f_2)$. From what we
have just seen, $(U_0, f_0)$ is equivalent to $(U_2, f_2)$, and the relation is therefore
transitive. We could take the union of all the sets $U_0$ appearing in the pairs
within an equivalence class and obtain the largest domain within $U$ on which
the rational function in question is regular. This notion for a rational function will
not be too useful for us, but an analogous notion for rational maps in Section 6
will be quite handy.


In similar fashion the local ring $\mathcal{O}_P(U)$ can be formulated in terms of "germs"
of regular functions as follows. Fix $P$ in $U$, and consider all pairs $(U_0, f_0)$ such
that $U_0$ is an open subset of $U$ containing $P$ and $f_0$ is a scalar-valued function on
$U_0$ satisfying the regularity condition on $U_0$ in the proposition.¹² Say that $(U_0, f_0)$
is equivalent to $(U_1, f_1)$ if $f_0 = f_1$ on some open subset of $U$ containing
$P$. It is easy to see that the result is an equivalence relation. An equivalence
class is called a germ of regular functions at $P$. Germs inherit a natural addition,
scalar multiplication, and multiplication, and the set of germs at $P$ is therefore a
$k$ algebra. The use of germs is the traditional device in mathematics for isolating
local behavior of functions in arbitrarily small neighborhoods of points.


¹¹That is, for each $P$ in $U_0$, there exist an open subset $W$ of $U_0$ containing $P$ and polynomials $a$
and $b$ in $A$ such that $b$ is nowhere vanishing on $W$ and $f_0 = a/b$ on $W$.


Corollary 10.29. Let U be a nonempty open subset of the affine variety 
V in A”, and let P be in U. To each germ {(Uo, fo)} of regular functions 
at P corresponds one and only one member f of k(V) that is associated via 
Proposition 10.28 to each pair (Uo, fo). Moreover, this correspondence is a k 
algebra isomorphism of the ring of germs onto the local ring Op(U). 


PROOF. If $(U_0, f_0)$ and $(U_1, f_1)$ are two pairs in a germ at $P$, then the definition
of germ gives a pair $(W, g_0)$ such that $W$ is a neighborhood of $P$ contained in
$U_0 \cap U_1$ and $g_0$ agrees with $f_0$ and $f_1$ on $W$. Proposition 10.28 supplies unique
members $f$, $f'$, and $g$ of $k(V)$ such that $f$ is regular on $U_0$ and agrees with $f_0$
there, such that $f'$ is regular on $U_1$ and agrees with $f_1$ there, and such that $g$ is
regular on $W$ and agrees with $g_0$ there. The uniqueness in the proposition shows
that $f = g$ and that $g = f'$. Therefore $f = f'$. So we have a well-defined map
of germs into $k(V)$.

The image $f$ of the pair $(U_0, f_0)$ is a member of $k(V)$ that is regular on $U_0$,
hence is defined at $P$. Thus the map on germs is into $\mathcal{O}_P(U)$. It is a $k$ algebra
homomorphism because of the definitions of the operations on germs. If the germ
of $(U_0, f_0)$ maps to 0, then $f_0$ is the 0 function on $U_0$, and any representative
$(W, g_0)$ of the germ with $W \subseteq U_0$ has $g_0$ equal to the 0 function on $W$. Thus the
germ is the 0 germ, and the $k$ algebra homomorphism is one-one. Finally if $f$ is a
member of $\mathcal{O}_P(U)$, then $f = a/b$ with $a$ and $b$ in $A(V)$ and with $b$ nonvanishing
at $P$. By Proposition 10.26, $b$ is nonvanishing on some open neighborhood $U_0$
of $P$. Then the germ of $(U_0, f_0)$ maps to $f$ if $f_0$ is defined as the restriction of
$a/b$ to $U_0$. Therefore the $k$ algebra homomorphism is onto $\mathcal{O}_P(U)$.


This completes the discussion of the definitions in the cases of affine and
quasi-affine varieties. Next we consider projective varieties, beginning with the simple
definitions. Let $V = V(\mathfrak{p})$ be a projective variety, $\mathfrak{p}$ being a prime homogeneous
ideal in $\widetilde{A}$ different from $\bigoplus_{d=1}^{\infty} \widetilde{A}_d$. The integral domain
$\widetilde{A}(V) = \widetilde{A}/\mathfrak{p}$ is called the homogeneous coordinate ring of $V$. Since $\mathfrak{p}$ is
homogeneous, we can write $\widetilde{A}(V)$ as

$$\widetilde{A}(V) = \bigoplus_{d=0}^{\infty} \widetilde{A}_d/(\widetilde{A}_d \cap \mathfrak{p}) = \bigoplus_{d=0}^{\infty} \widetilde{A}(V)_d.$$


¹²See the previous footnote.

Let us write the quotient homomorphism $\widetilde{A} \to \widetilde{A}(V)$ as $F \mapsto \bar{F}$. We say that $\bar{F}$
is homogeneous of degree $d$ if it lies in $\widetilde{A}(V)_d = \widetilde{A}_d/(\widetilde{A}_d \cap \mathfrak{p})$.

Despite Proposition 10.12, homogeneous members of $\widetilde{A}(V)$ do not yield
well-defined functions on $V$, and we cannot simply imitate the affine case in defining
the function field of $V$. The function field $k(V)$ of $V$ is a certain proper subfield
of the field of fractions of $\widetilde{A}(V)$, namely the set of all quotients $\bar{F}/\bar{G}$ with $\bar{F}$ and $\bar{G}$
homogeneous of the same degree and with $\bar{G} \ne 0$. If the common degree of $F$ and
$G$ is $d$, then the quotient $F/G$ is homogeneous of degree 0 in $(x_0, \dots, x_n)$ and is
therefore well-defined on the equivalence class $[x_0, \dots, x_n]$ in $\mathbb{P}^n$. Such quotients
form a field because if $\bar{F}_1$ and $\bar{G}_1$ are homogeneous of degree $d$ and $\bar{F}_2$ and $\bar{G}_2$ are
homogeneous of degree $e$, then $\bar{F}_1/\bar{G}_1 + \bar{F}_2/\bar{G}_2 = (\bar{F}_1\bar{G}_2 + \bar{G}_1\bar{F}_2)/(\bar{G}_1\bar{G}_2)$
and $(\bar{F}_1\bar{F}_2)/(\bar{G}_1\bar{G}_2)$ are each the quotient of two members of $\widetilde{A}(V)$ that are
homogeneous of degree $d + e$, the denominator not being the zero element, and
because the inverse of a nonzero quotient $\bar{F}/\bar{G}$ is $\bar{G}/\bar{F}$. Elements of $k(V)$ are called
rational functions on $V$.

Although the values of homogeneous members of $\widetilde{A}$ are not meaningful on
$\mathbb{P}^n$, the zero locus of such a polynomial is well defined. If $\bar{F}$ is a member of
the quotient $\widetilde{A}(V)$ homogeneous of degree $d$, then its set of preimages in $\widetilde{A}_d$
is $F + (\widetilde{A}_d \cap \mathfrak{p})$. The members of $\widetilde{A}_d \cap \mathfrak{p}$ all vanish at every point of $V$, and
therefore whether $F$ vanishes at a point $P$ of $V$ depends only on the coset of $F$ in
$\widetilde{A}(V)$. Accordingly, a member $h$ of $k(V)$ is said to be regular at the point
$P = [x_0, \dots, x_n]$ of $V$, or defined at $P$, if $h$ can be written as a quotient $h = \bar{F}/\bar{G}$ of
homogeneous members of $\widetilde{A}(V)$ of the same degree in such a way that $G(P) \ne 0$.
In this case, $h(P)$ is well defined as the quotient $F(x_0, \dots, x_n)/G(x_0, \dots, x_n)$
for any $(x_0, \dots, x_n)$ representing the point $P = [x_0, \dots, x_n]$.
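The independence of the representative can be checked numerically. The following is a minimal sketch, assuming two hypothetical homogeneous polynomials `F` and `G` of degree 2 on $\mathbb{P}^2$ and working over the rationals rather than an algebraically closed field; rescaling the homogeneous coordinates multiplies numerator and denominator by the same factor, so the quotient is unchanged.

```python
from fractions import Fraction

def F(x0, x1, x2):
    # hypothetical homogeneous polynomial of degree 2
    return x0 * x1 + x2 * x2

def G(x0, x1, x2):
    # hypothetical homogeneous polynomial of degree 2
    return x0 * x0 + x1 * x2

def h(point):
    """Value of F/G at a representative, or None where G vanishes there."""
    denom = G(*point)
    return None if denom == 0 else Fraction(F(*point), denom)

rep = (Fraction(1), Fraction(2), Fraction(3))
scaled = tuple(Fraction(-7, 5) * c for c in rep)  # same point of P^2

value_rep = h(rep)
value_scaled = h(scaled)
```

Both calls return the same fraction because the common factor $\lambda^2$ cancels; with forms of different degrees the two values would in general differ.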

The set of points P in the projective variety V at which a rational function h 
on V fails to be regular is called the pole set of h. The proof of the following 
result is similar to the proof of Proposition 10.24 and is therefore omitted. 


Proposition 10.30. If $h$ is a rational function on the projective variety
$V = V(\mathfrak{p})$, then the pole set of $h$ is the projective algebraic set $V(\mathfrak{a}) \subseteq V(\mathfrak{p})$
corresponding to the homogeneous ideal $\mathfrak{a} \supseteq \mathfrak{p}$ generated by all homogeneous $G \in \widetilde{A}$
such that $\bar{G}h$ is in $\widetilde{A}(V)$.


As in the case of affine varieties, the set of members of $k(V)$ regular at $P$ in $V$
is a $k$ subalgebra of $k(V)$ called the local ring of $V$ at $P$ and denoted by $\mathcal{O}_P(V)$.


Corollary 10.31. If $V = V(\mathfrak{p})$ is a projective variety, then

$$k = \bigcap_{P \in V} \mathcal{O}_P(V).$$


REMARKS. The classical prototype of this corollary is that a rational function
without poles on the Riemann sphere is constant. A direct proof of this fact for the
Riemann sphere in the style of this book follows by applying Proposition 6.9 to
the sum of the given rational function and any constant function. A generalization
appears as Corollary 9.4.


PROOF. The inclusion $\subseteq$ is automatic. For the reverse inclusion, suppose that
the rational function $h$ on $V$ lies in $\bigcap_{P \in V} \mathcal{O}_P(V)$. Then the pole set of $h$ in $V$
is empty. The pole set for $h$ is the set $V(\mathfrak{a})$ for the ideal $\mathfrak{a}$ in Proposition 10.30,
and it follows from the homogeneous Nullstellensatz (Proposition 10.12a) that
$\widetilde{A}_N \subseteq \mathfrak{a}$ for all $N$ sufficiently large. For any such $N$, $\widetilde{A}(V)_N h$ lies in $\widetilde{A}(V)$. It is
homogeneous of degree $N$ and hence is in $\widetilde{A}(V)_N$. Iterating this inclusion gives

$$\widetilde{A}(V)_N h^k \subseteq \widetilde{A}(V)_N \quad \text{for all } k \ge 0. \tag{$*$}$$


Since $V$ is nonempty, some $X_i$ is not in $\mathfrak{p}$; to fix the notation, let us suppose
that $X_0$ is not in $\mathfrak{p}$. Then $\bar{X}_0 \ne 0$. Inclusion $(*)$ shows that $\bar{X}_0^N h^k$ lies in $\widetilde{A}(V)$
for all $k \ge 0$. Thus $h^k$ lies in the subset $\bar{X}_0^{-N}\widetilde{A}(V)$ of the field of fractions of
$\widetilde{A}(V)$, and the ring $\widetilde{A}(V)[h]$, given by the substitution homomorphism $X \mapsto h$
applied to the polynomial ring $\widetilde{A}(V)[X]$, is exhibited as an $\widetilde{A}(V)$ submodule of
the finitely generated $\widetilde{A}(V)$ module $\bar{X}_0^{-N}\widetilde{A}(V)$ of the field of fractions of $\widetilde{A}(V)$.

Since $\widetilde{A}(V)$ is Noetherian as a homomorphic image of $\widetilde{A}$, $\widetilde{A}(V)[h]$ is a finitely
generated $\widetilde{A}(V)$ module. By Proposition 8.35 of Basic Algebra, $h$ is a root of
some monic polynomial in $\widetilde{A}(V)[X]$. Say that $h$ satisfies

$$h^l + c_{l-1}h^{l-1} + \cdots + c_1 h + c_0 = 0$$

with each $c_j$ in $\widetilde{A}(V)$. Decomposing each term into homogeneous parts and
equating to 0 the sum of the terms homogeneous of degree 0 shows that we can
assume each $c_j$ to be in $\widetilde{A}(V)_0 = k$. That is, we may assume that $h$ is algebraic
over $k$. Since $k$ is algebraically closed, $h$ is in $k$.


If we consider the complement of the pole set of $h$, then we see from Proposition
10.30 that the subset of $V$ at which $h$ is regular is open in $V$. Hence it is empty
or dense in $V$. On the set where $h$ is regular, $h$ is continuous into $\mathbb{A}^1$, according
to the following proposition, whose proof is the same as for Proposition 10.26.


Proposition 10.32. If a rational function $h$ on the projective variety $V$ is
regular on the nonempty open set $U$ of $V$, then it is continuous from $U$ into $\mathbb{A}^1$
with the Zariski topology (in which the proper closed sets are the finite sets).

The procedure for extending the above remarks from projective varieties to
quasiprojective varieties is the same as for extending the earlier remarks from
affine varieties to quasi-affine varieties. Let the quasiprojective variety $U$ in $\mathbb{P}^n$
have closure the projective variety $V$. If $h$ is a rational function on $V$, then
Proposition 10.32 shows that $h$ is regular on a nonempty open subset of $V$. Since
the intersection of any two nonempty open subsets is nonempty, $h$ is regular on
a nonempty open subset of $U$. Therefore it is meaningful to view $h$ as a rational
function on $U$. Thus we define the function field of $U$ to be the same as the
function field of $V$: $k(U) = k(V)$. The definition of regular function at $P$ is
the same for the quasiprojective variety $U$ as for its Zariski closure $V$, and thus
the local ring of $U$ at $P$ is given by $\mathcal{O}_P(U) = \mathcal{O}_P(V)$. A rational function is
said to be regular on the quasiprojective variety $U$ if it is regular at every point
of $U$. The set of regular functions on $U$ is a $k$ algebra denoted by $\mathcal{O}(U)$. Thus

$$\mathcal{O}(U) = \bigcap_{P \in U} \mathcal{O}_P(U).$$


For the special case that U = V, Corollary 10.31 shows that O(V) reduces to the 
constants. 


The next step is to check that the simple definitions in this section in the affine
and quasi-affine cases are consistent with the simple definitions in the projective
and quasiprojective cases. Proposition 10.18 and Corollary 10.19 tell us the
extent of the overlap: any of the mappings $\beta_j : \mathbb{A}^n \to \mathbb{P}^n$ with $0 \le j \le n$
allows us to identify any quasi-affine variety with a quasiprojective variety. Thus
what we need to show is that the definitions of function field, functions regular at
a point, and functions regular on a variety amount to the same thing for a
quasi-affine variety $U$ and for the quasiprojective variety $\beta_j(U)$. For concreteness we
shall take $j = 0$.

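Corollaries 10.21 and 10.22 tell us exactly what we are to compare. The prime
ideals $\mathfrak{a}$ of $\widetilde{A}$ not containing $X_0$ are in one-one correspondence with the prime
ideals $\mathfrak{b}$ of $A$, the correspondence being $\mathfrak{b} = \beta_0^t(\mathfrak{a})$, and the Zariski closure of
$\beta_0(V(\mathfrak{b}))$ in $\mathbb{P}^n$ is $V(\mathfrak{a})$. The correspondence does not yield a natural map of $\mathfrak{b}$
into $\mathfrak{a}$. Instead, the system of linear mappings $\varphi_d : A_{\le d} \to \widetilde{A}_d$ given by

$$F(X_0, \dots, X_n) = \varphi_d(f)(X_0, \dots, X_n) = X_0^d\, f(X_1/X_0, \dots, X_n/X_0)$$

is a system of inverses to the system of restrictions $\beta_0^t : \widetilde{A}_d \to A_{\le d}$ of the
homomorphism $\beta_0^t : \widetilde{A} \to A$ given by

$$f(X_1, \dots, X_n) = \beta_0^t(F)(X_1, \dots, X_n) = F(1, X_1, \dots, X_n),$$

and these systems have the properties that

$$\mathfrak{a} \cap \widetilde{A}_d = \varphi_d(\mathfrak{b} \cap A_{\le d}) \qquad \text{and} \qquad \beta_0^t(\mathfrak{a} \cap \widetilde{A}_d) = \mathfrak{b} \cap A_{\le d}.$$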
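The two systems of maps just described, homogenization in degree $d$ and the substitution $X_0 = 1$, can be illustrated concretely. The sketch below is not from the text; it assumes a hypothetical representation of a polynomial as a dictionary from exponent tuples to coefficients, and the function names `homogenize` and `dehomogenize` are illustrative only.

```python
def dehomogenize(F):
    """Set X_0 = 1 in a polynomial in X_0, ..., X_n (the map F -> F(1, X_1, ..., X_n))."""
    f = {}
    for (e0, *rest), c in F.items():
        key = tuple(rest)
        f[key] = f.get(key, 0) + c
    # drop monomials whose coefficients cancelled
    return {k: c for k, c in f.items() if c != 0}

def homogenize(f, d):
    """Send f of total degree <= d to X_0^d f(X_1/X_0, ..., X_n/X_0)."""
    F = {}
    for exps, c in f.items():
        deg = sum(exps)
        if deg > d:
            raise ValueError("degree of f exceeds d")
        # pad each monomial with the power of X_0 that brings it to degree d
        F[(d - deg, *exps)] = c
    return F

# f = X_1^2 + 3 X_2 + 5 in k[X_1, X_2], viewed as having degree <= 3
f = {(2, 0): 1, (0, 1): 3, (0, 0): 5}
F = homogenize(f, 3)

# H = X_1^3 + 2 X_0 X_1 X_2, homogeneous of degree 3 in k[X_0, X_1, X_2]
H = {(0, 3, 0): 1, (1, 1, 1): 2}
```

On polynomials of degree at most $d$ and on forms of degree exactly $d$ the two functions undo each other, which is the sense in which the systems are inverse to one another.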


Proposition 10.33. Let a prime ideal $\mathfrak{a}$ of $\widetilde{A}$ not containing $X_0$ correspond to
the prime ideal $\mathfrak{b}$ of $A$ under the formula $\mathfrak{b} = \beta_0^t(\mathfrak{a})$ as in Theorem 10.20, and let
$U = V(\mathfrak{b})$ and $V = V(\mathfrak{a})$ be the respective affine and projective varieties for $\mathfrak{b}$
and $\mathfrak{a}$, $V$ being the Zariski closure of $\beta_0(U)$ in $\mathbb{P}^n$. Then $\beta_0^t$ descends to a ring
homomorphism $\psi$ of $\widetilde{A}(V)$ onto $A(U)$, and $\psi$ in turn induces a canonical field
isomorphism $\Psi : k(V) \to k(U)$. Under the field isomorphism $\Psi$, the image of
the local ring $\mathcal{O}_{\beta_0(P)}(V)$ is $\mathcal{O}_P(U)$ for each $P$ in $U$.


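PROOF. Since $\beta_0^t$ carries $\widetilde{A}$ onto $A$ and carries $\mathfrak{a}$ into $\mathfrak{b}$, $\beta_0^t$ descends to a
homomorphism $\psi$ of $\widetilde{A}/\mathfrak{a} = \widetilde{A}(V)$ onto $A/\mathfrak{b} = A(U)$. If $\bar{F}$ and $\bar{G}$ are in
the same homogeneous summand $\widetilde{A}(V)_d$ of $\widetilde{A}(V)$, then we define $\Psi(\bar{F}/\bar{G}) =
\psi(\bar{F})/\psi(\bar{G})$ as a member of the field of fractions $k(U)$ of $A(U)$. If $\bar{F}/\bar{G} =
\bar{F}'/\bar{G}'$, then $\bar{F}\bar{G}' = \bar{F}'\bar{G}$. Applying $\psi$, using that $\psi$ is a homomorphism, and
reinterpreting matters in $k(U)$, we see that $\Psi(\bar{F}/\bar{G}) = \Psi(\bar{F}'/\bar{G}')$, i.e., that $\Psi$ is
well defined. A similar argument that involves clearing fractions and applying $\psi$
shows that $\Psi$ respects addition and multiplication. Therefore $\Psi$ is a field mapping
of $k(V)$ into $k(U)$.

Let $A(U)_{\le d}$ be the image of $A_{\le d}$ in $A/\mathfrak{b} = A(U)$. Since $\beta_0^t$ carries $\widetilde{A}_d$ onto
$A_{\le d}$ and carries $\mathfrak{a} \cap \widetilde{A}_d$ onto $\mathfrak{b} \cap A_{\le d}$, $\psi$ carries $\widetilde{A}(V)_d$ onto $A(U)_{\le d}$. Any
member of $k(U)$ is the quotient of two members of $A(U)_{\le d}$ for some $d$, and
it is consequently $\Psi$ of the quotient of the corresponding members of $\widetilde{A}(V)_d$.
Therefore $\Psi$ carries $k(V)$ onto $k(U)$ and is a field isomorphism.

Let $\bar{F}$ and $\bar{G}$ in $\widetilde{A}(V)$ be the cosets $F + \mathfrak{a}$ and $G + \mathfrak{a}$, let $f = \beta_0^t(F)$ and
$g = \beta_0^t(G)$, and let $\bar{f}$ and $\bar{g}$ in $A(U)$ be the cosets $f + \mathfrak{b}$ and $g + \mathfrak{b}$. Then $\psi(\bar{F}) = \bar{f}$
and $\psi(\bar{G}) = \bar{g}$, and hence $\Psi(\bar{F}/\bar{G}) = \bar{f}/\bar{g}$. Let $P = (x_1, \dots, x_n)$ be in $U$, so that
$\beta_0(P) = [1, x_1, \dots, x_n]$ is in $\beta_0(U)$. Define $\beta_0^{\#}(P) = (1, x_1, \dots, x_n)$ in $\mathbb{A}^{n+1}$,
so that the class of $\beta_0^{\#}(P)$ in $\mathbb{P}^n$ is $\beta_0(P)$. Then $\bar{g}(P) = g(P) = (\beta_0^t G)(P) =
G(\beta_0^{\#}(P)) = G(\beta_0(P))$. Therefore $\bar{f}/\bar{g}$ lies in $\mathcal{O}_P(U)$ if and only if $\bar{F}/\bar{G}$ lies
in $\mathcal{O}_{\beta_0(P)}(V)$. So $\Psi$ carries $\mathcal{O}_{\beta_0(P)}(V)$ onto $\mathcal{O}_P(U)$.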


Corollary 10.34. Let $V$ be a projective variety, and let $U$ be a nonempty open
subset of $V$. Then each member of $\mathcal{O}(U) \subseteq k(V)$ is determined as an element
in $k(V)$ by its restriction to $U$.


PROOF. Subtracting two such members, we may assume that their difference
$h$ is 0 on $U$. We are to prove that $h = 0$ in $k(V)$. For some $j$ with $0 \le j \le n$,
$\beta_j(\mathbb{A}^n) \cap V$ is nonempty, and we may assume that this is the case for $j = 0$.
The subset $V_0 = \beta_0^{-1}(V)$ of $\mathbb{A}^n$ is an affine variety. Since $U$ and $\beta_0(\mathbb{A}^n) \cap V$ are
nonempty open subsets of $V$, their intersection is nonempty, and $U_0 = \beta_0^{-1}(U)$ is
a nonempty open subset of $V_0$. Let $\Psi : k(V) \to k(V_0)$ be the field isomorphism
in Proposition 10.33. By assumption, $h$ is in $\mathcal{O}_{\beta_0(P)}(V)$ for every $P$ in $U_0$. Since
the value of $h$ at $\beta_0(P)$ is 0, $h$ is actually in the maximal ideal of $\mathcal{O}_{\beta_0(P)}(V)$ for $P$
in $U_0$. Proposition 10.33 shows that $\Psi(h)$ is in the maximal ideal of $\mathcal{O}_P(V_0)$ for
all $P$ in $U_0$. Fix $P_0$ in $U_0$. Then we can write the member $\Psi(h)$ of $k(V_0)$ as
$\Psi(h) = a/b$ with $b(P_0) \ne 0$. Since $b$ is continuous on $V_0$ by Proposition 10.26,
$b(P)$ is nonzero for all $P$ in some neighborhood $W$ of $P_0$ contained in $U_0$. Then
the formula $\Psi(h) = a/b$ shows explicitly that $\Psi(h)$ is defined at such points
$P$ and satisfies $\Psi(h)(P) = a(P)/b(P)$. Since $\Psi(h)$ is in the maximal ideal of
$\mathcal{O}_P(V_0)$ for all $P$ in $U_0$, $\Psi(h)(P) = 0$ for $P$ in $W$. Hence $a(P) = 0$ for $P$ in
$W$. Consequently $a$ and 0 are two members of $A(V_0)$ that are equal on $W$, and
Lemma 10.27 allows us to conclude that $a = 0$. Therefore $h = 0$.


Proposition 10.35. Let $U$ be a nonempty open subset of the projective variety
$V$ in $\mathbb{P}^n$. Suppose that $h_0 : U \to k$ is a function with the following property: for
each $P$ in $U$, there exist an open subset $W$ of $U$ containing $P$ and homogeneous
polynomials $F$ and $G$ in $\widetilde{A}$ of the same degree such that $G$ is nowhere vanishing
on $W$ and $h_0 = F/G$ on $W$. Then there exists one and only one member $h$ of
$k(V)$ such that $h$ is regular on $U$ and agrees with $h_0$ at every point of $U$.


REMARKS. For the quasiprojective case the more complicated local-sounding
definition of “regular function” on $U$, mentioned in the first paragraph of this
section, is what is assumed of $h_0$ in the statement of this proposition. The
proposition says that such an $h_0$ necessarily comes from a global rational function
on $V$ that is regular on $U$ in the sense just above.


PROOF. For each $j$ with $0 \le j \le n$ such that $V_j = \beta_j(\mathbb{A}^n) \cap V$ is nonempty,
$\beta_j^{-1}(V_j)$ is an affine variety, and $U_j = U \cap V_j$ is a nonempty open subset such that
$h_{j,0} = h_0|_{U_j}$ is a function on $U_j$ with the following property: for each $P$ in $U_j$,
there exist an open subset $W$ of $U_j$ containing $P$ and homogeneous polynomials
$F$ and $G$ in $\widetilde{A}$ of the same degree such that $G$ is nowhere vanishing on $W$ and
$h_{j,0} = F/G$ on $W$. We pull back this situation by $\beta_j$, writing $\beta_j^{*}h_{j,0}$ for the
function on $\beta_j^{-1}(W)$ given by $(\beta_j^{*}h_{j,0})(Q) = h_{j,0}(\beta_j(Q))$. The set $\beta_j^{-1}(V_j)$ is an
affine variety, and the Zariski closure of $V_j$ in $\mathbb{P}^n$ is $V$. The homomorphism $\beta_j^t$ on
$\widetilde{A}$ descends to a ring homomorphism $\psi_j : \widetilde{A}(V) \to A(\beta_j^{-1}(V_j))$, and $\psi_j$ induces
a field isomorphism $\Psi_j : k(V) \to k(\beta_j^{-1}(V_j))$, according to Proposition 10.33.

The set $\beta_j^{-1}(U_j)$ is a nonempty open subset of the affine variety $\beta_j^{-1}(V_j)$,
and $\beta_j^{*}h_{j,0}$ is a function on $\beta_j^{-1}(U_j)$ with the following property: for each $P$ in
$\beta_j^{-1}(U_j)$, there exist an open subset $W$ of $\beta_j^{-1}(U_j)$ containing $P$ and homogeneous
polynomials $F$ and $G$ in $\widetilde{A}$ of the same degree such that their images $\bar{F}$ and $\bar{G}$
in $\widetilde{A}(V)$ have $\bar{G}$ nowhere vanishing on $W$ and have $\beta_j^{*}h_{j,0} = \psi_j(\bar{F})/\psi_j(\bar{G}) =
\Psi_j(\bar{F}/\bar{G})$ on $W$. Proposition 10.33 says that $\psi_j(\bar{F}) = a$ and $\psi_j(\bar{G}) = b$ for
members $a$ and $b$ of $A(\beta_j^{-1}(V_j))$. We are in the situation of Proposition 10.28 with
$f_0 = \beta_j^{*}h_{j,0}$, and that proposition produces a unique member $h_j$ of $k(\beta_j^{-1}(V_j))$
that is regular on $\beta_j^{-1}(U_j)$ and agrees with $\beta_j^{*}h_{j,0}$ at every point of $\beta_j^{-1}(U_j)$.

The member $h$ of $k(V)$ that we seek is $h = \Psi_j^{-1}(h_j)$. To verify this assertion,
we are to show that $\Psi_j^{-1}(h_j)$ is independent of $j$. Thus suppose that $V_i \cap V_j \ne \varnothing$.
Fix $P$ in $U_i \cap U_j = U \cap V_i \cap V_j$, and choose the above open neighborhood $W$
of $P$ small enough for the above construction to apply for both indices $i$ and $j$.
By the uniqueness in Proposition 10.28, $h_j$ is the unique member of $k(\beta_j^{-1}(V_j))$
that is regular on $\beta_j^{-1}(W)$ and agrees with $\beta_j^{*}h_{j,0} = \beta_j^{*}(h_0|_{U_j})$ at every point of
$\beta_j^{-1}(W)$. Thus $\Psi_j^{-1}(h_j) = \bar{F}/\bar{G}$ on $W$, where $F$ and $G$ are as in the previous
paragraph. By the same uniqueness argument, $\Psi_i^{-1}(h_i) = \bar{F}/\bar{G}$ on $W$. The
difference $\Psi_i^{-1}(h_i) - \Psi_j^{-1}(h_j)$ is a member of $k(V)$ that is regular on $W$ and
vanishes there. By Corollary 10.34, the difference is 0 as an element of $k(V)$.
Therefore $\Psi_j^{-1}(h_j)$ is independent of $j$, and we can take $h$ to be this member of
$k(V)$.


Just as in the quasi-affine case, it is possible in the quasiprojective case to give 
a local-sounding definition of rational function and a formulation of Op(U) in 
terms of germs. We shall not use these notions, and we omit any further discussion 
of them. 


5. Morphisms 


The goal of this section and the next is to introduce maps that make the collection 
of all quasiprojective varieties over an algebraically closed field k into the objects 
of a category in a way that does not depend on the ambient space $\mathbb{A}^n$ or $\mathbb{P}^n$ of
the variety. These maps will all be algebraic in nature, and there will be two 
choices of which class of maps to use, one involving good denominators and 
one allowing occasional bad denominators. The first kind of map will be called 
a “morphism,” and the second kind of map will be called a “dominant rational 
map.” The relationships between these two kinds of maps and the interpretation 
of these maps in terms of function fields will be of great importance in applying 
this theory. 

A variety over the algebraically closed field $k$ henceforth will be any affine,
quasi-affine, projective, or quasiprojective variety as in the previous sections. To
each such variety $V$, Section 4 associates a function field $k(V)$, a local ring
$\mathcal{O}_P(V) \subseteq k(V)$ of regular functions at each point $P$, and a ring $\mathcal{O}(E) =
\bigcap_{P \in E} \mathcal{O}_P(V) \subseteq k(V)$ of regular functions on each nonempty open subset $E$
of $V$. We have observed that each rational function on a variety $V$ is regular on
some nonempty open subset of $V$, namely the complement of the pole set. One
further fact that we shall use about rational functions is the following.


Proposition 10.36. If $P$ and $Q$ are distinct points of a variety $V$, then there
exists a rational function $h \in k(V)$ such that $h$ is defined at both $P$ and $Q$, has
$h(P) = 0$, and has $h(Q) \ne 0$.


PROOF. Without loss of generality, we may assume that $V$ is projective. Say
that $V \subseteq \mathbb{P}^n$. Let $\mathfrak{p}$ be the prime homogeneous ideal in $\widetilde{A} = k[X_0, \dots, X_n]$ such
that $\widetilde{A}(V) = \widetilde{A}/\mathfrak{p}$, and let $F \mapsto \bar{F}$ be the quotient homomorphism $\widetilde{A} \to \widetilde{A}(V)$.
Let $P = [x_0, \dots, x_n]$ and $Q = [y_0, \dots, y_n]$. Choose a homogeneous polynomial
$F$ in $\widetilde{A}$ such that $F(x_0, \dots, x_n) = 0$ and $F(y_0, \dots, y_n) \ne 0$, and choose a
homogeneous polynomial $G$ with $\deg G = \deg F$ such that $G(x_0, \dots, x_n) \ne 0$
and $G(y_0, \dots, y_n) \ne 0$. Then $\bar{G}$ is not 0, and $h = \bar{F}/\bar{G}$ has the required
properties.
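The proof only asserts that suitable $F$ and $G$ can be chosen. One possible choice, shown here as a hedged sketch over the rationals with linear forms, is $F = x_j X_i - x_i X_j$ for a pair of indices whose $2 \times 2$ minor is nonzero, and $G = X_0 + tX_1 + t^2X_2 + \cdots$ for a small integer $t$. The helper names `evaluate` and `separating_forms` are hypothetical and are not part of the text.

```python
from fractions import Fraction

def evaluate(form, point):
    """Value of the linear form sum(c_a * X_a) at a representative of a point."""
    return sum(c * x for c, x in zip(form, point))

def separating_forms(P, Q):
    """Linear forms F, G with F(P) = 0, F(Q) != 0, G(P) != 0, G(Q) != 0."""
    n = len(P)
    # Distinct points of projective space have some nonzero 2x2 minor.
    i, j = next((i, j) for i in range(n) for j in range(n)
                if P[i] * Q[j] - P[j] * Q[i] != 0)
    F = [0] * n
    F[i], F[j] = P[j], -P[i]          # F = x_j X_i - x_i X_j
    # G = X_0 + t X_1 + t^2 X_2 + ...; in characteristic 0 at most
    # 2n - 2 values of t make G vanish at P or at Q.
    for t in range(2 * n - 1):
        G = [t ** a for a in range(n)]
        if evaluate(G, P) != 0 and evaluate(G, Q) != 0:
            return F, G
    raise ValueError("no suitable G found")

P = (1, 2, 3)
Q = (1, 2, 4)
F, G = separating_forms(P, Q)
h_at_P = Fraction(evaluate(F, P), evaluate(G, P))
h_at_Q = Fraction(evaluate(F, Q), evaluate(G, Q))
```

Because $F$ and $G$ have the same degree, the quotient is a rational function on the ambient projective space, and its restriction to $V$ is the function $h$ of the proposition.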


If $U$ and $V$ are varieties, then a continuous function $\varphi : U \to V$ relative to the
Zariski topology is called a morphism if for each nonempty open subset $E$ of $V$
and each regular function $f$ on $E$, the composition $f \circ \varphi$ is a regular function
on the open subset $\varphi^{-1}(E)$ of $U$. Thus $\varphi$ is to be continuous and is to induce by
composition a function from $\mathcal{O}(E)$ into $\mathcal{O}(\varphi^{-1}(E))$ for each open subset $E$ of $V$.
An isomorphism of varieties is a morphism having an inverse function that is a
morphism.

It is immediate that the composition of two morphisms is a morphism and that 
the identity function is a morphism. Thus the varieties over k form a category if 
morphisms are used as the maps. 


EXAMPLES OF MORPHISMS. Suppose that $k$ has characteristic different from 2.
Let $U$ be $\mathbb{P}^1$, written as

$$\mathbb{P}^1 = \{[s, t] \mid (s, t) \ne (0, 0)\},$$

and let $V$ be the projective variety in $\mathbb{P}^2$ defined by the irreducible homogeneous
polynomial $X^2 + Y^2 - Z^2$, i.e.,

$$V = \{[x, y, z] \mid x^2 + y^2 = z^2 \text{ and } (x, y, z) \ne (0, 0, 0)\}.$$

Let $\varphi : U \to V$ be the function given by

$$\varphi([s, t]) = [s^2 - t^2,\ 2st,\ s^2 + t^2].$$

This is well defined, and it is continuous because the Zariski closed proper subsets
of $V$ are the finite sets, whose inverse images are finite sets. If $F$ and $G$ are
two homogeneous members of $k[X, Y, Z]$ and if $\bar{F}$ and $\bar{G}$ are the images in
$\widetilde{A}(V) = k[X, Y, Z]/(X^2 + Y^2 - Z^2)$, we are to assume that $\bar{G}$ is not 0, i.e., that
$G$ is not divisible by $X^2 + Y^2 - Z^2$, and then $h = \bar{F}/\bar{G}$ is a typical rational
function on $V$. We are to show that if $h$ is regular on an open subset $E$ of $V$, then
$h \circ \varphi$ is regular on $\varphi^{-1}(E) \subseteq \mathbb{P}^1$. The expression $h = \bar{F}/\bar{G}$ exhibits $h$ as regular
on the open set $E$ of points $[x, y, z]$ of $V$ with $G(x, y, z) \ne 0$. The set $\varphi^{-1}(E)$
is the set of points $[s, t]$ in $\mathbb{P}^1$ with $G(s^2 - t^2, 2st, s^2 + t^2) \ne 0$. At such points
the function $h \circ \varphi$ is given by

$$(h \circ \varphi)(s, t) = F(s^2 - t^2, 2st, s^2 + t^2)/G(s^2 - t^2, 2st, s^2 + t^2),$$

and it is given by a rational expression with nonvanishing denominator. Thus $\varphi$
is a morphism.
Let us see that $\psi : V \to \mathbb{P}^1$ given by

$$\psi[x, y, z] = \begin{cases} [x + z,\ y] & \text{if } [x, y, z] \ne [1, 0, -1], \\ [-y,\ x - z] & \text{if } [x, y, z] \ne [1, 0, 1] \end{cases}$$

consistently defines another morphism. For the consistency we observe that
$x^2 + y^2 = z^2$ implies that $(x + z)(x - z) = -y^2$; hence on the common domain
of the two expressions, $[x + z, y] = [-y^2/(x - z), y] = [-y/(x - z), 1] =
[-y, x - z]$. Continuity of $\psi$ follows because the inverse image of any finite set
is a finite set. For the regularity we observe that if $F$ and $G$ are homogeneous
members of the same degree in $\widetilde{A}(\mathbb{P}^1) = k[S, T]$ with $G \ne 0$ and if $h = F/G$,
then the expression $h = F/G$ exhibits $h$ as regular on the open set $E$ of points
$[s, t]$ in $\mathbb{P}^1$ with $G(s, t) \ne 0$. The set $\psi^{-1}(E)$ is the set of points $[x, y, z]$ on $V$
with $G(x + z, y) \ne 0$. At such points the function $h \circ \psi$ is given by

$$(h \circ \psi)[x, y, z] = F(x + z, y)/G(x + z, y),$$

and it is given by a rational expression with a nonvanishing denominator. Thus
$\psi$ is a morphism. In other words, $\varphi$ is an isomorphism.
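That the two maps in these examples are inverse to each other can be checked point by point. The sketch below is an illustration only: it replaces the algebraically closed field $k$ by the finite field with 7 elements (an assumption made purely so that all points can be listed), and the helper `normalize` for choosing a representative of a projective point is hypothetical.

```python
p = 7  # hypothetical odd prime; F_p stands in for the field k

def normalize(v):
    """Representative of a projective point whose first nonzero entry is 1."""
    v = [c % p for c in v]
    lead = next(c for c in v if c)
    scale = pow(lead, -1, p)
    return tuple(c * scale % p for c in v)

def phi(s, t):
    """[s, t] -> [s^2 - t^2, 2st, s^2 + t^2]."""
    return normalize((s * s - t * t, 2 * s * t, s * s + t * t))

def psi(x, y, z):
    """[x, y, z] -> [x + z, y] or [-y, x - z], whichever is defined."""
    first = ((x + z) % p, y % p)
    second = (-y % p, (x - z) % p)
    if any(first) and any(second):
        # the two formulas agree on their common domain
        assert normalize(first) == normalize(second)
    return normalize(first if any(first) else second)

line = {normalize((s, t)) for s in range(p) for t in range(p) if (s, t) != (0, 0)}
conic = {normalize((x, y, z))
         for x in range(p) for y in range(p) for z in range(p)
         if (x, y, z) != (0, 0, 0) and (x * x + y * y - z * z) % p == 0}

lands_on_conic = all(phi(*pt) in conic for pt in line)
psi_after_phi = all(psi(*phi(*pt)) == pt for pt in line)
phi_after_psi = all(phi(*psi(*pt)) == pt for pt in conic)
```

The three flags record that the first map lands on the conic and that the two composites are the identity on the listed points, which matches the identities $[2s^2, 2st] = [s, t]$ and $2(x+z)[x, y, z] = [x, y, z]$ obtained by direct substitution.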


Proposition 10.37. Let $\beta_0 : \mathbb{A}^n \to \mathbb{P}^n$ be the usual inclusion. If $U$ is a
quasi-affine variety in $\mathbb{A}^n$, then $\beta_0$ is an isomorphism of the quasi-affine variety
$U$ onto the quasiprojective variety $\beta_0(U)$.


PROOF. Proposition 10.18 shows that $\beta_0$ is a homeomorphism of $U$ onto its
image. The last conclusion of Proposition 10.33 implies that the regular functions
for $U$ match those for $\beta_0(U)$ under $\beta_0$, and the result follows.


Theorem 10.38. Let $U$ be any variety, let $V$ be any affine variety, and let
$A(V)$ be the affine coordinate ring of $V$. Then the morphisms $\varphi : U \to V$ are in
one-one correspondence with the $k$ algebra homomorphisms $\widetilde{\varphi} : A(V) \to \mathcal{O}(U)$
via the formula

$$\widetilde{\varphi}(f) = f \circ \varphi \qquad \text{for } f \in A(V).$$


REMARKS. Members $f$ of $A(V)$ lie in $\mathcal{O}(V)$. The $k$ algebra homomorphism
$\widetilde{\varphi}$ is meaningful because the fact that $\varphi$ is a morphism implies that $f \circ \varphi$ is in
$\mathcal{O}(\varphi^{-1}(E))$ for every open $E$ in $V$; here we take $E = V$ and $\varphi^{-1}(E) = U$. The
proof of Theorem 10.38 will be preceded by a lemma.


Lemma 10.39. If $U$ is a variety and $V$ is an affine variety in $\mathbb{A}^n$, then a function
$\psi : U \to V$ is a morphism if and only if $\bar{X}_i \circ \psi$ is a regular function on $U$ for
the image $\bar{X}_i$ in $A(V)$ of each coordinate function $X_i$ with $1 \le i \le n$.


PROOF. If $\psi$ is a morphism, then the definition of morphism forces $\bar{X}_i \circ \psi$ to
be a regular function.

Conversely suppose $\psi$ has the property that each $\bar{X}_i \circ \psi$ is a regular function.
Then $f \circ \psi$ is a regular function on $U$ for each $f$ in $A(V)$, since every member
of $A(V)$ is a polynomial in the elements $\bar{X}_i$. If $E$ is a closed set in $V$, then $E$ is
the locus of common zeros of some set $\{f_\alpha\}$ of polynomials, and $\psi^{-1}(E)$ is the
set of points $P$ such that $f_\alpha(\psi(P)) = 0$ for all $\alpha$. Hence $\psi^{-1}(E)$ is the locus of
common zeros of a set $\{f_\alpha \circ \psi\}$ of regular functions on $U$ and is relatively
closed in $U$. Thus $\psi$ is continuous.

If $E$ is nonempty open in $V$, then $k(E) = k(V)$ shows that each regular
function $h$ on $E$ is locally the quotient of members of $A(V)$ with nonvanishing
denominator. Let us write $h = f/g$ with $g$ nonvanishing near a point of interest.
Then $h \circ \psi = (f \circ \psi)/(g \circ \psi)$ is exhibited locally as a rational function with
nonvanishing denominator.


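PROOF OF THEOREM 10.38. Suppose that $\alpha : A(V) \to \mathcal{O}(U)$ is a $k$ algebra
homomorphism. Define $\psi : U \to V$ by $\psi(P) = (\alpha(\bar{X}_1)(P), \dots, \alpha(\bar{X}_n)(P))$.
Then $\bar{X}_i \circ \psi = \alpha(\bar{X}_i)$ is in $\mathcal{O}(U)$ by definition of $\alpha$, and Lemma 10.39 shows
that $\psi$ is a morphism.

The $k$ algebra homomorphism $\widetilde{\psi}$ defined by $\widetilde{\psi}(f) = f \circ \psi$ has $\widetilde{\psi}(\bar{X}_i) =
\bar{X}_i \circ \psi = \alpha(\bar{X}_i)$. Since the elements $\bar{X}_i$ generate $A(V)$, $\widetilde{\psi} = \alpha$. Thus starting
from $\alpha$, forming $\psi$, and obtaining $\widetilde{\psi}$ recovers $\alpha$. In the reverse direction if we
start from $\varphi$, form $\widetilde{\varphi}$, and use the construction of the previous paragraph to obtain
$\psi$, then $\psi(P) = (\widetilde{\varphi}(\bar{X}_1)(P), \dots, \widetilde{\varphi}(\bar{X}_n)(P)) = (\bar{X}_1(\varphi(P)), \dots, \bar{X}_n(\varphi(P))) =
\varphi(P)$ for $P$ in $U$. Hence $\psi = \varphi$. Thus the function $\alpha \mapsto \psi$ is a two-sided inverse
of the function $\varphi \mapsto \widetilde{\varphi}$.


Corollary 10.40. If $U$ and $V$ are affine varieties, then the morphisms
$\varphi : U \to V$ are in one-one correspondence with the $k$ algebra homomorphisms
$\widetilde{\varphi} : A(V) \to A(U)$ via the formula

$$\widetilde{\varphi}(f) = f \circ \varphi \qquad \text{for } f \in A(V).$$


PROOF. This is immediate from Theorem 10.38, since Corollary 10.25 shows
that $\mathcal{O}(U) = A(U)$.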
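The passage from a homomorphism to a morphism and back can be illustrated on a small case. The sketch below assumes a hypothetical example that is not in the text: $U = \mathbb{A}^1$ with coordinate $t$, $V = V(Y - X^2)$ in $\mathbb{A}^2$, and the homomorphism sending the images of $X$ and $Y$ in $A(V)$ to $t$ and $t^2$. Regular functions are modeled as ordinary Python functions evaluated at rational sample points.

```python
from fractions import Fraction

# Hypothetical data: alpha is specified by the images of the generators
# X-bar and Y-bar of A(V), viewed as regular functions on U = A^1.
def alpha_X(t):
    return t

def alpha_Y(t):
    return t * t

def psi(t):
    """The map U -> V with psi(P) = (alpha(X-bar)(P), alpha(Y-bar)(P))."""
    return (alpha_X(t), alpha_Y(t))

def pullback(f):
    """The homomorphism f -> f o psi attached to psi."""
    return lambda t: f(*psi(t))

def defining(x, y):
    # generator of the ideal of V
    return y - x * x

def f(x, y):
    # an arbitrary polynomial function on V
    return 3 * x * y + y * y - 2

samples = [Fraction(n, 3) for n in range(-6, 7)]
image_in_V = all(defining(*psi(t)) == 0 for t in samples)
recovers_alpha = all(pullback(lambda x, y: x)(t) == alpha_X(t) and
                     pullback(lambda x, y: y)(t) == alpha_Y(t)
                     for t in samples)
composite = all(pullback(f)(t) == 3 * t**3 + t**4 - 2 for t in samples)
```

The check `recovers_alpha` mirrors the step in the proof where the pullback of the coordinate functions agrees with the given homomorphism on generators.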


Proposition 10.41. If $U$ and $V$ are varieties and if $\varphi : U \to V$ and $\psi : U \to V$
are morphisms such that $\varphi|_E = \psi|_E$ for some nonempty open set $E$ in $U$, then
$\varphi = \psi$.

PROOF. Let $h$ be a rational function on $V$, and let $E'$ be the nonempty open
subset of $V$ on which $h$ is regular. Since $\varphi$ and $\psi$ are morphisms, $h \circ \varphi$ and $h \circ \psi$
are regular on the respective nonempty open subsets $\varphi^{-1}(E')$ and $\psi^{-1}(E')$ of $U$.
The equality $\varphi|_E = \psi|_E$ shows that $h \circ \varphi$ and $h \circ \psi$ are equal on the nonempty
open subset $E \cap \varphi^{-1}(E') \cap \psi^{-1}(E')$ of $U$. The function $h \circ \varphi - h \circ \psi$ is therefore
a rational extension from $E \cap \varphi^{-1}(E') \cap \psi^{-1}(E')$ to $U$ of the 0 function, and
Corollary 10.34 shows that $h \circ \varphi - h \circ \psi = 0$ on $U$. Therefore $h \circ \varphi = h \circ \psi$
as elements of $k(U)$ for every $h$ in $k(V)$.

Arguing by contradiction, suppose that $P$ is a point in $U$ for which $\varphi(P) \ne
\psi(P)$. Then Proposition 10.36 produces $h$ in $k(V)$ such that $h$ is regular on
an open subset $F$ of $V$ containing $\varphi(P)$ and $\psi(P)$ and has $h(\varphi(P)) = 0$ and
$h(\psi(P)) \ne 0$. Since $\varphi$ and $\psi$ are morphisms, $h \circ \varphi$ and $h \circ \psi$ are regular on the
open set $\varphi^{-1}(F) \cap \psi^{-1}(F)$. Their respective values at $P$ are $h(\varphi(P)) = 0$ and
$h(\psi(P)) \ne 0$. Since $h \circ \varphi = h \circ \psi$ as rational functions, this is a contradiction.


Proposition 10.42. Suppose that $U$ and $V$ are varieties and that $\varphi : U \to V$
is a morphism. If $P$ is in $U$, then $\varphi$ induces a $k$ algebra homomorphism
$\widetilde{\varphi}_P : \mathcal{O}_{\varphi(P)}(V) \to \mathcal{O}_P(U)$. Composition of morphisms goes to composition
of these homomorphisms in the reverse order.


PROOF. Propositions 10.33 and 10.37 together imply that we may assume $U$ and
$V$ to be quasi-affine. Let $f$ in $k(V)$ be defined at $\varphi(P)$. Proposition 10.24 shows
that the set $E$ on which $f$ is regular is open in $V$. Since $\varphi$ is a morphism and $f$ is
regular on $E$, $f \circ \varphi$ is regular on the open subset $\varphi^{-1}(E)$ of $U$. Proposition 10.28,
applied to $\varphi^{-1}(E) \subseteq U$, shows that there exists a unique member $F$ of $k(U)$ that
is regular on $\varphi^{-1}(E)$ and agrees with $f \circ \varphi$ on $\varphi^{-1}(E)$. We put $\widetilde{\varphi}_P(f) = F$.
It is a routine matter to check that $\widetilde{\varphi}_P$ is a $k$ algebra homomorphism and that
compositions go to compositions in the reverse order.


6. Rational Maps


This section will introduce a second kind of map that makes the collection of all 
(quasiprojective) varieties over the algebraically closed field k into a category. 
These maps will not be ordinary functions, and the definition requires some care. 

If $U$ and $V$ are varieties over the algebraically closed field $k$, then a rational
map $\varphi : U \to V$ is an equivalence class of pairs $(E, \varphi_E)$, where $E$ is a nonempty
open set of $U$ and $\varphi_E$ is a morphism of $E$ into $V$. The equivalence relation on two
such pairs is that $(E, \varphi_E) \sim (E', \varphi_{E'})$ if $\varphi_E|_{E \cap E'} = \varphi_{E'}|_{E \cap E'}$. This is meaningful,
since the intersection of any two nonempty open sets is nonempty. The relation
$\sim$ is certainly reflexive and symmetric, and Proposition 10.41 shows that it is
transitive. We can therefore take the union of the open subsets $E$ such that some
pair $(E, \varphi_E)$ is in the equivalence class, and $\varphi$ will be definable as a morphism on
this union. This union is called the largest domain on which $\varphi$ is a morphism.

A morphism from $U$ to $V$ defines a rational map. But a rational map need not
be an everywhere-defined function, and forming the composition of two rational
maps is problematic. For example, if $E$ is the open subset of $U$ on which a rational
map $\varphi : U \to V$ is defined and $F$ is the open subset of $V$ on which a rational
map $\psi : V \to W$ is defined, then it may happen that $\varphi(E)$ is disjoint from $F$. In
this case the composition $\psi \circ \varphi$ makes no sense.

A rational map $\varphi : U \to V$ is said to be dominant if $\varphi_E$ has dense image in
$V$ for some (and hence every) pair $(E, \varphi_E)$ in the equivalence class. It is evident
that the composition of two dominant rational maps makes sense as a rational
map. The identity mapping is a dominant rational map, and thus the collection
of all varieties over $k$ becomes a category if the dominant rational maps are used
as the maps of the category.

A birational map is a dominant rational map $\varphi : U \to V$ that has a dominant
rational map $\psi : V \to U$ as a two-sided inverse. Two varieties admitting a
birational map from the one to the other are said to be birationally equivalent
varieties, or to be birational.


EXAMPLE. The irreducible affine plane curves defined by $T^2 - (S^4 + 1)$ and
$Y^2 - (X^3 - 4X)$ are birationally equivalent if $k$ has characteristic different from 2.
Birational mappings in the two directions are given by

$$S = \frac{Y}{2X}, \quad T = \frac{Y^2 + 8X}{4X^2} \qquad \text{and} \qquad X = \frac{2}{T - S^2}, \quad Y = \frac{4S}{T - S^2}.$$

The rational map from $(X, Y)$ to $(S, T)$ is a morphism on the complement of
$(0, 0)$ in the locus $y^2 = x^3 - 4x$ in $\mathbb{A}^2$. The rational map from $(S, T)$ to $(X, Y)$
is a morphism on the entire locus $t^2 = s^4 + 1$ in $\mathbb{A}^2$.

Let $\varphi : U \to V$ be a dominant rational map, and let $(E, \varphi_E)$ be any pair in the
equivalence class $\varphi$. If $f \in k(V)$ is a rational function on $V$, then the subset $F$ of
$V$ on which $f$ is defined is open and nonempty. So $f|_F$ is a regular function on $F$.
Since $\varphi_E$ is continuous and has dense image, $E' = \varphi_E^{-1}(F)$ is a nonempty open
set in $E \subseteq U$. The function $\varphi_{E'} = \varphi_E|_{E'}$ is a morphism from $E'$ into $F$, and thus
$f|_F \circ \varphi_{E'}$ is a regular function on $E'$. We can therefore regard it as a rational
function on $U$, i.e., a member of $k(U)$. Consequently the dominant rational map
$\varphi : U \to V$ induces a function $\widetilde{\varphi} : k(V) \to k(U)$ that is easily seen to be a field mapping
respecting $k$. Compositions of dominant rational maps lead to compositions of
such field mappings in the reverse order.


EXAMPLE, CONTINUED. The two irreducible affine plane curves in the example
earlier in this section have been observed to be birationally equivalent. In view
of the previous paragraph, their function fields must be isomorphic. Taking into
account that the genus of a curve, as defined in Section IX.3, depends only on the
function field, we see that the two curves must have the same genus. This equality
is confirmed by Example 3 of genus in Section IX.3, which shows that the genus
of $k[x, y]/(y^2 - p(x))$, where $p(x)$ is a square-free polynomial of degree $m$ in
characteristic different from 2, is $\tfrac{1}{2}m - 1$ if $m$ is even and is $\tfrac{1}{2}(m - 1)$ if $m$ is odd.
The two curves under study have $m = 4$ and $m = 3$, and the genus is 1 in both
cases.
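The birational equivalence of the curves $t^2 = s^4 + 1$ and $y^2 = x^3 - 4x$, with mappings $S = Y/(2X)$, $T = (Y^2 + 8X)/(4X^2)$ and $X = 2/(T - S^2)$, $Y = 4S/(T - S^2)$, can be sanity-checked by brute force. The sketch below is illustrative only: it works over the finite field with 13 elements instead of an algebraically closed field (an assumption made so that all points can be enumerated), and it also evaluates the genus formula quoted above. The function names are hypothetical.

```python
p = 13  # hypothetical odd prime; F_p stands in for the field k

def inv(a):
    """Multiplicative inverse modulo p (raises ValueError when a is 0 mod p)."""
    return pow(a % p, -1, p)

# F_p-points of the two affine curves
cubic = {(x, y) for x in range(p) for y in range(p)
         if (y * y - (x ** 3 - 4 * x)) % p == 0}
quartic = {(s, t) for s in range(p) for t in range(p)
           if (t * t - (s ** 4 + 1)) % p == 0}

def to_quartic(x, y):
    """(X, Y) -> (S, T) = (Y/(2X), (Y^2 + 8X)/(4X^2)); needs X != 0."""
    return (y * inv(2 * x) % p, (y * y + 8 * x) * inv(4 * x * x) % p)

def to_cubic(s, t):
    """(S, T) -> (X, Y) = (2/(T - S^2), 4S/(T - S^2))."""
    u = inv(t - s * s)
    return (2 * u % p, 4 * s * u % p)

def genus(m):
    """Genus quoted in the text for y^2 = p(x), p square-free of degree m."""
    return m // 2 - 1 if m % 2 == 0 else (m - 1) // 2

forward_ok = all(to_quartic(x, y) in quartic
                 and to_cubic(*to_quartic(x, y)) == (x, y)
                 for (x, y) in cubic if (x, y) != (0, 0))
backward_ok = all(to_cubic(s, t) in cubic
                  and to_quartic(*to_cubic(s, t)) == (s, t)
                  for (s, t) in quartic)
```

The denominator $T - S^2$ never vanishes on the quartic because $(T - S^2)(T + S^2) = 1$ there, which is why `to_cubic` needs no excluded point, whereas `to_quartic` must avoid $(0, 0)$.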


The main result of this section will be a converse to the construction just made, 
showing how to pass from a k algebra homomorphism between function fields to 
a dominant rational map in the reverse order. We require two lemmas. 


Lemma 10.43. Let V = V(f) be the hypersurface!? in A” defined by a non- 
constant polynomial f ink[X1,..., X,]. Then the open set A” — V is isomorphic 
to an affine variety, specifically to the hypersurface in A”*! corresponding to the 
irreducible polynomial Xy4) f(X1,..., Xn) — Link[Xy,..., Xnqu]. 


REMARKS. Even though f is not assumed irreducible, X_{n+1} f − 1 is irreducible.
In fact, consideration of the degree in X_{n+1} shows that the only possible nontrivial
factorization is of the form (X_{n+1} a − b)(c) with a, b, c in k[X_1, ..., X_n]. Then
bc = 1, and c has to be scalar. The open set A^n − V is a quasi-affine variety
(having closure A^n), and the lemma therefore asserts that this quasi-affine variety
is isomorphic to a certain affine variety in A^{n+1}.


PROOF. Let W = V(X_{n+1} f − 1). Let φ : W → A^n be the map defined by
φ(x_1, ..., x_{n+1}) = (x_1, ..., x_n) for (x_1, ..., x_{n+1}) in W. Then X_j ∘ φ is projection


¹³In the application of Lemma 10.43 to Lemma 10.44, it is important that the polynomial f is
allowed to be reducible.
to the j-th coordinate for 1 ≤ j ≤ n, which is a regular function on W. Lemma
10.39 shows that φ is a morphism, and φ is one-one from W onto A^n − V by inspection. The
inverse function is given by φ^{-1}(x_1, ..., x_n) = (x_1, ..., x_n, 1/f(x_1, ..., x_n)).
Let X̄_j be the image of X_j in k[X_1, ..., X_{n+1}]/(X_{n+1} f − 1) for 1 ≤ j ≤ n + 1.
Then (X̄_j ∘ φ^{-1})(x_1, ..., x_n) equals x_j for j ≤ n and equals 1/f(x_1, ..., x_n) for
j = n + 1, and these are regular functions on the complement of V(f) in A^n.
By Lemma 10.39, φ^{-1} is a morphism.
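
The two maps in the proof can be tried on a concrete case. The sketch below takes n = 2 and f = X_1 X_2 (a reducible polynomial, which the lemma allows) and works over the rationals rather than over an algebraically closed field; the choice of f, the field, and the helper names are assumptions made only for illustration.

```python
from fractions import Fraction


def f(x1, x2):
    # Sample nonconstant polynomial; reducible on purpose.
    return x1 * x2


def on_W(pt):
    """Membership test for W = V(X_3 * f(X_1, X_2) - 1) in A^3."""
    x1, x2, x3 = pt
    return x3 * f(x1, x2) - 1 == 0


def phi(pt):
    """Projection from W to A^2 - V(f) that forgets the last coordinate."""
    return pt[:2]


def phi_inverse(pt):
    """Map from A^2 - V(f) to W sending (x1, x2) to (x1, x2, 1/f(x1, x2))."""
    x1, x2 = pt
    value = f(x1, x2)
    if value == 0:
        raise ValueError("point lies on V(f)")
    return (x1, x2, 1 / Fraction(value))


p = (Fraction(2), Fraction(-3))
w = phi_inverse(p)
print(w, on_W(w), phi(w) == p)
```

The extra coordinate records 1/f, which is exactly why the image avoids V(f) and why the inverse map is given by regular functions.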


Lemma 10.44. If V is a variety, then there is a base for the Zariski topology 
on V consisting of open sets that are isomorphic to affine varieties. 


PROOF. Let P be in V, and let U be an open subset of V containing P. 
We are to produce an open subset W of U containing P that is isomorphic to 
an affine variety. Since any nonempty open set of a quasiprojective variety is 
a quasiprojective variety, U is a variety. Thus we may assume that U = V. 
Since any projective variety in P^n is covered by the affine varieties isomorphic
via Proposition 10.37 to nonempty intersections with β_i(A^n), any quasiprojective
variety is covered by quasi-affine varieties. Thus we may assume that U = V
is quasi-affine in A^n. Let X be the closed subset X = V̄ − V in A^n, where V̄ is
the closure of V, and let a = I(X). Since P is in V, it is not in X, and there exists
some f in a with f(P) ≠ 0. Let Y = V(f). The point P is not in Y, and thus
W = V − V(f) is relatively open in V and contains P.

Being relatively open in V, W is a quasi-affine variety. Since f vanishes on X,
V(f) contains X = V̄ − V. Thus the equality W = V̄ − V(f) exhibits W as a
relatively closed subset of A^n − V(f), which Lemma 10.43 shows is isomorphic
to an affine variety. Hence W itself is isomorphic to a quasi-affine variety that is
closed in an affine variety. That is, W is isomorphic to an affine variety.


Theorem 10.45. Let U and V be varieties, and let φ ↦ φ̂ be the function
carrying dominant rational maps φ : U → V to field mappings φ̂ : k(V) → k(U)
respecting the operations by k and given by


φ̂(f) = (class of f|_F ∘ φ_E|_{E’}),


where f is in k(V), f is regular on F, (E, φ_E) is a pair in the class φ, and
E’ = φ_E^{-1}(F). Then φ ↦ φ̂ is one-one onto the set of all field mappings from
k(V) into k(U) respecting k. Furthermore, if P ∈ U and Q ∈ V are points, then
the maximal ideal of φ̂(O_Q(V)) is contained in the maximal ideal of O_P(U) if and
only if P is in the largest domain on which φ is a morphism and has φ(P) = Q.


REMARK. The ring O_P(U) is the k vector space sum of its maximal ideal
and the constants, since evaluation at P is a well-defined multiplicative linear
functional on O_P(U), and a similar comment applies to O_Q(V). Whatever φ̂
does, it certainly carries 1 to 1, and hence if φ̂ carries the maximal ideal of
O_Q(V) to the maximal ideal of O_P(U), then it carries O_Q(V) to O_P(U) also.


PROOF. We begin by inverting φ ↦ φ̂. Lemma 10.44 shows that any variety
is covered by open subvarieties isomorphic to affine varieties, and the function
fields of the variety and the subvarieties may all be identified with one another.
Thus there is no loss in generality in assuming that V is an affine variety in
A^n. Let x_1, ..., x_n be the images in A(V) of X_1, ..., X_n, and suppose that a
k algebra homomorphism γ : k(V) → k(U) is given. Then γ(x_1), ..., γ(x_n)
are rational functions on U, and we can find a nonempty open subset E of U on
which all these functions are regular. Since γ is a homomorphism, γ yields by
restriction of the images a homomorphism γ : A(V) → O(E). Moreover, this
version of γ is one-one on A(V) because γ as a field mapping is one-one and
because Proposition 10.34 shows that each member of O(E) extends in only one
way to a member of k(U). Theorem 10.38 produces a morphism ψ : E → V
such that ψ̃ = γ for this restricted version of γ, where ψ̃(f) = f ∘ ψ. Then the
equivalence class φ of the pair (E, ψ) is a rational map of U into V.

To see that φ is dominant, suppose on the contrary that ψ(E) is contained in a
proper closed subset of V. Then we can find a polynomial f that is 0 on ψ(E) but is not
identically 0 on V. The image f̄ of f in A(V) is nonzero. Since the restricted
version of γ is one-one, γ(f̄) is nonzero in O(E). However, γ(f̄) = ψ̃(f̄) =
f̄ ∘ ψ, and the right side is 0 on E, contradiction.

The construction is arranged in such a way that if we start from φ, form φ̂,
and go through the construction to produce a rational map of U into V, then
the resulting rational map is φ. In the reverse direction, suppose that we start
from γ, produce φ, and then form φ̂, and suppose that f in k(V) is in A(V). If
E ⊆ U is as in the first paragraph of the proof, then a representative of φ is the
pair (E, φ_E), where φ_E is the morphism such that φ̃_E = γ. Then φ̂(f) is the
class of f ∘ φ_E, which equals φ̃_E(f) and hence γ(f). In other words, γ and φ̂
agree on A(V); being field mappings, they agree on k(V). This completes the
proof of the first conclusion of the theorem.

Now suppose that φ is a dominant rational map from U to V and that φ̂ is the
corresponding field map of k(V) to k(U). Let P ∈ U and Q ∈ V be points,
suppose that there is an open neighborhood E of P such that (E, φ_E) is in the
equivalence class φ, and suppose that φ_E(P) = Q. Lemma 10.44 shows that
there is a base of open neighborhoods of Q in V consisting of open sets that are
isomorphic to affine varieties. Since φ_E is by assumption continuous, we can
select any such open neighborhood and assume that φ_E carries E into it. Thus
there is no loss of generality in assuming that V is isomorphic to an affine variety.
We associate to φ_E the k algebra homomorphism φ̃_E : O(V) → O(E) given
by φ̃_E(f) = f ∘ φ_E for f ∈ O(V). This formula shows that the members f
of O(V) that vanish at Q are carried to members of O(E) that vanish at P and
that members of O(V) that do not vanish at Q go to members of O(E) that do
not vanish at P. Therefore φ̂ carries O_Q(V) into O_P(E) = O_P(U), with the
maximal ideal going into the maximal ideal.

Conversely suppose that the field map φ̂ has the property that the maximal
ideal of φ̂(O_Q(V)) is contained in the maximal ideal of O_P(U). Possibly by
passing to an open subneighborhood from the outset, we may assume by Lemma
10.44 that U and V are isomorphic to affine varieties. Dropping the isomorphism
from the notation, we can write O(V) = A(V) = k[y_1, ..., y_m] by Corollary
10.25. Each φ̂(y_j) is a rational function on U, which we can write as φ̂(y_j) =
a_j/b_j with a_j and b_j in O(U) = A(U). The hypothesis on φ̂ implies that
φ̂(O_Q(V)) ⊆ O_P(U), hence that each φ̂(y_j) is regular at P. Thus we may take
each denominator b_j to have b_j(P) ≠ 0. Choose an open neighborhood of P on
which all b_j are nonvanishing and an open subneighborhood E that is isomorphic
to an affine variety. Since φ̂ respects the field operations, it carries any polynomial
in y_1, ..., y_m to a quotient c/d with c and d in O(E) and with d nowhere 0 on E.
Therefore c/d is in ⋂_{P’ ∈ E} O_{P’}(E) = O(E). That is, φ̂ carries O(V) into O(E).
Since V is isomorphic to an affine variety, Corollary 10.25 and Theorem 10.38
show that φ̂ : O(V) → O(E) is given by the formula


φ̂(h)(u) = h(φ_E(u)) (*)


for some morphism φ_E : E → V and all h ∈ O(V) and u ∈ E. The first part
of the proof shows that the pair (E, φ_E) is in the equivalence class φ. Hence P
is in the largest domain on which φ is a morphism. Arguing by contradiction,
suppose that φ_E(P) = Q’ ≠ Q. Choose by Proposition 10.36 a rational function
h on V that is defined at both Q and Q’ and has h(Q) = 0 and h(Q’) ≠ 0. Then
φ̂ carries O_Q(V) and its maximal ideal into O_P(U) and its maximal ideal, and
we obtain 0 = φ̂(h)(P) = h(φ_E(P)) = h(Q’) ≠ 0, contradiction. We therefore
conclude that φ_E(P) = Q, and the proof of the second conclusion of the theorem
is complete.


Corollary 10.46. If U and V are varieties, then the following conditions are 
equivalent: 


(a) U and V are birationally equivalent, 

(b) k(U) and k(V) are isomorphic as k algebras, 

(c) there are nonempty open subsets E of U and F of V such that E and F
are isomorphic as varieties.


PROOF. The equivalence of (a) and (b) follows from Theorem 10.45 and the 
fact that composition of dominant rational maps corresponds to composition of 
homomorphisms of k algebras in the reverse order. 

Let us check that (c) implies (a). If (c) holds, let φ : E → F and ψ : F → E
be morphisms that are inverse to each other. Then the equivalence classes of
(E, φ) and (F, ψ) are rational maps from U to V and from V to U, respectively.
The equivalence class of (E, ψ ∘ φ) = (E, 1_E) is the identity rational map on U,
and the equivalence class of (F, φ ∘ ψ) = (F, 1_F) is the identity rational map on
V. Hence the rational maps are inverses of one another. This proves (a).

Finally let us check that (a) implies (c). If (a) holds, let φ : U → V and
ψ : V → U be rational maps that are inverse to each other. Let (E_1, φ)
and (F_1, ψ) be pairs representing φ and ψ. Then a pair representing ψ ∘ φ
is (φ^{-1}(F_1), ψ ∘ φ) because φ is a morphism on the open subset φ^{-1}(F_1) of E_1
and ψ is a morphism on the open set F_1 containing φ(φ^{-1}(F_1)). Since ψ ∘ φ is
the identity on U as a rational map, ψ ∘ φ is the identity morphism on φ^{-1}(F_1).
Put E = φ^{-1}(F_1) ⊆ E_1. Similarly φ ∘ ψ is the identity morphism on ψ^{-1}(E_1),
and we put F = ψ^{-1}(E_1) ⊆ F_1. Let us see that φ(E) ⊆ F. If e is in E, we are to
exhibit some e_1 ∈ E_1 with ψ(φ(e)) = e_1, and then φ(e) will be in F = ψ^{-1}(E_1);
for this purpose we can take e_1 = e, since ψ ∘ φ is the identity morphism on E.
Similarly ψ(F) ⊆ E. Thus φ and ψ exhibit E and F as isomorphic varieties.
This proves (c).


7. Zariski’s Theorem about Nonsingular Points 


Sections 1-6 have established the definitions and elementary properties of va- 
rieties, maps between varieties, and dimension. The present section concerns 
singularities, which are a fundamental topic of interest in algebraic geometry.¹⁴
This topic was introduced in Section VII.5 in a context that we now recognize as 
affine varieties. 

The definition of “nonsingular” was motivated by the classical Implicit Func- 
tion Theorem. Let k be an algebraically closed field, let the affine space in 
question be A”, and let p be the prime ideal such that the affine variety to study 
in A^n is V(p). If {f_i} is a finite set of generators of p and if P is in V(p), then P
is said to be a nonsingular point of V(p) if rank [∂f_i/∂x_j (P)] = n − dim V(p), and
otherwise it is singular. Zariski’s Theorem, which was formulated as Theorem
7.23 but only partially proved in Chapter VII, addressed this situation. In order
to rephrase the theorem in our current notation, let A(V) be the affine coordinate
ring of V, and let k(V) be the field of fractions of A(V), i.e., the function field
of V. Let m_P be the maximal ideal of all members of A(V) vanishing at P, and
let O_P(V) be the local ring at P; this is the localization of A(V) with respect to
the maximal ideal m_P and is a subring of k(V). The maximal ideal of O_P(V),
consisting of all members of k(V) defined and vanishing at P, will be denoted
by M_P. Theorem 7.23, translated into this notation, is as follows.


¹⁴The exposition in this section is based in part on Chapter I of Hartshorne’s book, Chapter III
of Reid’s book, and Chapter II of Volume 1 of Shafarevich’s books.
Theorem 10.47 (Zariski’s Theorem, rephrased). In the above notation,

dim_k(M_P/M_P^2) = dim_k(m_P/m_P^2) ≥ dim V(p),

and P is nonsingular if and only if equality holds. The set of nonsingular points
of V(p) is nonempty and open.


Toward the proof of this theorem, we showed in Section VII.5 for all P € V(p) 
that 


(a) dim_k(M_P/M_P^2) = dim_k(m_P/m_P^2),


(b) dim_k(m_P/m_P^2) + rank [∂f_i/∂x_j (P)] = n,


(c) P is a nonsingular point if and only if dim_k(m_P/m_P^2) = dim V(p).


In addition, we completed most of the proof in the special case that V(p) is an 
irreducible affine hypersurface by showing that 


(d) dim_k(m_P/m_P^2) ≥ dim V(p) for all P ∈ V(p),
(e) dim_k(m_P/m_P^2) = dim V(p) for some P ∈ V(p).


Our goal in this section is to complete the proof of Zariski’s Theorem in the general 
case as stated by reducing (d) and (e) for the general case to what has already 
been proved for the special case that V (p) is an irreducible affine hypersurface. 
We need also to see in all cases that the set of nonsingular points is Zariski open. 


Before proceeding, let us mention the significance of Theorem 10.47. The 
definition above of nonsingular and singular points extends immediately to 
quasi-affine varieties, using the same defining polynomials, and the theorem is 
then applicable because the open set of nonsingular points in an affine variety 
meets any nonempty open subset of the variety. In the projective case we can pull
matters back to affine space by means of one of the maps β_i : A^n → P^n. In this
way we obtain definitions of nonsingular and singular point for quasiprojective
varieties, and the theorem remains valid.¹⁵ What is far from obvious with such
a definition is that the decision nonsingular vs. singular for a point is unaffected
by isomorphisms of varieties. On the other hand, the equivalent condition on
M_P/M_P^2 as stated in Zariski’s Theorem is manifestly unaffected by isomorphisms
of varieties because of Proposition 10.42.


¹⁵Problems 13-16 at the end of the chapter show that the rank computation can alternatively be
made directly with the homogeneous polynomials defining the projective variety in question.
Proposition 10.48. Any m-dimensional variety is birationally equivalent to
an irreducible affine hypersurface H in A^{m+1}.


PROOF. Let V be the variety in question. By definition of dim V, the function 
field k(V) is a finitely generated extension field of k of transcendence degree 
m over k. Since algebraically closed fields are perfect, Theorem 7.20 shows 
that k(V) is “separably generated” over k, and Theorem 7.18 shows as a con- 
sequence that k(V) has a “separating transcendence basis,” i.e., a transcendence 
basis {x_1, ..., x_m} such that k(V) is a finite separable algebraic extension of
k(x_1, ..., x_m). By the Theorem of the Primitive Element, there exists an element
x_{m+1} of k(V) such that k(V) = k(x_1, ..., x_m)[x_{m+1}]. Let P(X_{m+1}) be the
minimal polynomial of x_{m+1} over k(x_1, ..., x_m). Writing out the equation
P(x_{m+1}) = 0 and clearing fractions, we see that x_{m+1} satisfies a polynomial
equation


a_r(x_1, ..., x_m) x_{m+1}^r + ··· + a_1(x_1, ..., x_m) x_{m+1} + a_0(x_1, ..., x_m) = 0


in which the coefficient polynomials a_i(X_1, ..., X_m) ∈ k[X_1, ..., X_m] have no
nontrivial common factor. In this case the polynomial f(X_1, ..., X_{m+1}) equal
to


a_r(X_1, ..., X_m) X_{m+1}^r + ··· + a_1(X_1, ..., X_m) X_{m+1} + a_0(X_1, ..., X_m)


is irreducible in k[X_1, ..., X_{m+1}]. Thus the principal ideal (f) defines an irre-
ducible affine hypersurface H = V(f) in A^{m+1} whose affine coordinate ring is
k[X_1, ..., X_{m+1}]/(f). The field of fractions k(H) is isomorphic to k(V), and
H is birationally equivalent to V by the equivalence of (a) and (b) in Corollary
10.46.


Lemma 10.49. Every point P in V(p) has 0 ≤ dim_k(M_P/M_P^2) ≤ n, and the
set of points P in V(p) with dim_k(M_P/M_P^2) ≥ r is a Zariski closed subset for
each integer r.


PROOF. The entries of the matrix [∂f_i/∂x_j] are polynomials, and the set of points
P of V(p) for which the matrix [∂f_i/∂x_j (P)] has rank ≤ s is a Zariski closed subset,
being the set on which all (s + 1)-by-(s + 1) minors of the matrix vanish. By
display formula (b) above, the set of points P for which dim_k(m_P/m_P^2) ≥ n − s
is closed, and (a) therefore shows that the set with dim_k(M_P/M_P^2) ≥ n − s is
closed.
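
Formula (b) turns the dimension of m_P/m_P^2 into a rank computation, so singular points can be located by evaluating partial derivatives. The sketch below does this for the plane cubic y^2 = x^3, a sample curve chosen here for illustration and not taken from the text, with rational coordinates standing in for an algebraically closed field. With n = 2 and dim V(p) = 1, a point is nonsingular exactly when the 1-by-2 matrix of partials has rank 1.

```python
from fractions import Fraction


def jacobian_row(x, y):
    """Partials of f(x, y) = y^2 - x^3, namely (-3x^2, 2y)."""
    return (-3 * x * x, 2 * y)


def rank_of_row(row):
    """Rank of a 1-by-n matrix: 1 if some entry is nonzero, else 0."""
    return 1 if any(entry != 0 for entry in row) else 0


def is_nonsingular(x, y, n=2, dim=1):
    """Check rank [df/dx_j (P)] = n - dim at a point P = (x, y) of the curve."""
    if y * y != x ** 3:
        raise ValueError("point is not on the curve")
    return rank_of_row(jacobian_row(x, y)) == n - dim


print(is_nonsingular(Fraction(0), Fraction(0)))   # the cusp at the origin
print(is_nonsingular(Fraction(1), Fraction(1)))
```

The singular set here is cut out by the vanishing of the 1-by-1 minors, which is the closed-set description used in the proof.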


PROOF OF THEOREM 10.47. Let m = dim V(p), and let a birational mapping
of V(p) to an affine hypersurface H of A^{m+1} be given. By the equivalence of (a)
and (c) in Corollary 10.46, there exist nonempty open subsets E of V(p) and F
of H that are isomorphic as varieties, say by an isomorphism φ : E → F. Since
m = dim V(p) = dim H, Proposition 10.11 shows that m = dim E = dim F
also. For each integer r ≥ 0, let


S_r = {P ∈ V(p) | dim_k(M_P/M_P^2) ≤ r},
T_r = {P ∈ E | dim_k(M_P/M_P^2) ≤ r},
U_r = {P ∈ F ⊆ H | dim_k(M_P/M_P^2) ≤ r}.


Lemma 10.49 shows that
S_r, T_r, U_r are relatively open in V(p), E, F, respectively, for each r. (*)
Application of Proposition 10.42 to φ and φ^{-1} gives
φ(T_r) = U_r for all r ≥ 0, (**)
and the special case of Theorem 10.47 proved in Section VII.5 shows that
U_m ≠ ∅ and U_{m−1} = ∅. (†)
Combining (**) and (†) yields
T_m ≠ ∅ and T_{m−1} = ∅. (††)
Since S_r ⊇ T_r, the first of these shows that
S_m ≠ ∅. (‡)
If S_{m−1} ≠ ∅, then E ∩ S_{m−1} ≠ ∅ because any two nonempty open subsets of
V(p) have nonempty intersection; but T_{m−1} = E ∩ S_{m−1} would then be nonempty,
in contradiction to (††). Thus


S_{m−1} = ∅. (‡‡)


In view of (a), (‡) proves (e) for V(p), and (‡‡) proves (d) for V(p). Because of
(‡‡), Lemma 10.49 implies that S_m is Zariski open; thus the set of nonsingular
points is open.
8. Classification Questions about Irreducible Curves


Sections 1-7 give the fundamentals concerning (quasiprojective) varieties over 
the algebraically closed field k. The remainder of the chapter will address aspects 
of three problems: 


(i) What are all varieties, or in what senses can varieties be classified? 
(ii) To what extent can one make computations in the subject? 
(iii) What can be said when the algebraically closed field k is replaced by a 
general commutative ring with identity? 


Algebraic geometry is an enormous subject, going well beyond these problems. 
For example the investigation of the nature of singularities is in itself a large 
subject, with striking applications to topology and differential equations. The 
use of homological methods ties algebraic geometry closely to topology and to 
number theory, and these methods have bearing on the extent to which compact 
complex manifolds admit the structure of projective varieties. Algebraic geometry 
is an ingredient in the subject of invariant theory, which studies classical varieties 
using representation theory. It is an ingredient also in the subject of algebraic 
groups, which concerns varieties with a group structure in which multiplication 
and inversion are morphisms. 

The present section concerns the first of the three problems listed above, and 
we limit our discussion to irreducible curves, i.e., to varieties of dimension 1.
We say that an irreducible curve is nonsingular if it is nonsingular at every 
point. We are going to show in this section that each birational equivalence 
class of irreducible curves over k contains a nonsingular projective curve and 
that any two nonsingular projective curves in the birational equivalence class are 
isomorphic as projective varieties.¹⁶ We also will get some information about
how this nonsingular curve in the class is related to the other curves in the class. 
To a great extent the classification of irreducible curves will therefore have been 
reduced to the classification of the birational equivalence classes, which Corollary 
10.46 says is the same thing as a classification of the function fields in one variable 
over k. We will not have anything to say about classifying the function fields in 
one variable except to say that each class has a genus, according to Section IX.3, 
and that every nonnegative integer can arise as a genus, according to Example 3 
of genus in Section IX.3.¹⁷

Chapter IX already contains clues about where to begin. Section IX.1 men- 
tioned the relevance of Dedekind domains to the study, and Problems 5-11 at 
the end of that chapter attached a discrete valuation to each nonsingular point of 
any irreducible affine plane curve. The notions of Dedekind domains, discrete 


¹⁶The exposition in this section is based in part on Chapter 7 of Fulton’s book, Chapter I of
Hartshorne’s book, Chapter II of Reid’s book, and Volume I by Zariski-Samuel.
¹⁷The subject of Teichmüller theory in effect addresses this question when k = C.
valuations, and nonsingular points are very closely related, and we begin with
some equivalences concerning them. Recall from Sections 2 and 4 that the affine
coordinate ring A(C) of any irreducible affine curve C has Krull dimension 1.
That is, the Noetherian domain A(C) has the property that every nonzero prime
ideal is maximal. We have seen that the local ring O_P(C) at any point is a
localization of A(C), namely the localization of A(C) with respect to the maximal
ideal m_P of functions vanishing at P. Furthermore, the proper ideals of such a
localization are exactly the sets S^{-1}a with a equal to an ideal disjoint from the
set-theoretic complement S of m_P in A(C). It follows that every nonzero prime
ideal in O_P(C) is maximal. This conclusion extends to the quasiprojective case
as a consequence of Proposition 10.33. Zariski’s Theorem in Section 7 shows that
nonsingularity of the point P of C can be detected from O_P(C). Consequently
the following proposition is relevant.


Proposition 10.50. Let R be a Noetherian local ring that is an integral domain 
with the property that the only nonzero prime ideal is the maximal ideal. Let M 
be the unique maximal ideal of R, let K be the field of fractions of R, and let 
F = R/M be the quotient field. Under the assumption that M ≠ 0 and therefore
that R ≠ K, the following conditions on R are equivalent:


(a) R is integrally closed, 

(b) R is a Dedekind domain, 

(c) R is a principal ideal domain, 

(d) R is the valuation ring relative to some discrete valuation of K,
(e) M is a principal ideal,

(f) dim_F M/M^2 = 1.


REMARKS. Consider (f). To see how M/M^2 becomes an F vector space in a
natural way, let r + M be a member of F, and let m + M^2 be a member of M/M^2.
Then (r + M)(m + M^2) = rm + M^2 is a well-defined scalar multiplication of
F on M/M^2, and M/M^2 becomes a vector space over F. Nakayama’s Lemma
(Lemma 8.51 of Basic Algebra, restated in the present book on page xxiii) shows
that an equality MN = N for a finitely generated R module N is possible only
if N = 0; since M itself is a finitely generated R module, being an ideal in a
Noetherian ring, and since M ≠ 0 by assumption, M^2 = M is not possible.
Therefore dim_F M/M^2 ≥ 1.


PROOF. If (a) holds, then R satisfies the three conditions (Noetherian, integrally 
closed, every nonzero prime ideal maximal) in the definition of Dedekind domain. 
Thus (a) implies (b). A Dedekind domain with only finitely many maximal ideals 
is a principal ideal domain by Corollary 8.62 of Basic Algebra, and thus (b) implies
(c). A principal ideal domain is a unique factorization domain by Theorem 8.15 
of Basic Algebra, and thus (c) implies (a) by Proposition 8.41 of Basic Algebra. 
To see that (a) through (c) are equivalent to (d), first suppose that (a) through
(c) hold. Then every fractional ideal in K relative to R is of the form M^k for
some integer k. If x ≠ 0 is in K, then the principal fractional ideal xR is of the
form xR = M^k for some k. Section VI.2 shows that the formula v(x) = k (with
v(0) = ∞) defines a discrete valuation on K, and the definition of v shows that
the valuation ring of v is R. Hence (d) holds. Conversely if (d) holds, then R is
a principal ideal domain by Proposition 6.2; thus (c) and necessarily (a) and (b)
hold.
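
For a concrete instance of the valuation in (d), one can take K = Q(x) and let v be the order of vanishing at x = 0, so that the valuation ring consists of the rational functions defined at 0. The sketch below handles nonzero polynomials given as coefficient lists in increasing degree and extends v to quotients by subtraction; the representation, the base field Q, and the function names are assumptions made for illustration.

```python
from fractions import Fraction


def order_at_zero(coeffs):
    """v(p) for a nonzero polynomial p = sum coeffs[i] * x^i: least i with coeffs[i] != 0."""
    for i, c in enumerate(coeffs):
        if c != 0:
            return i
    raise ValueError("v(0) is infinite")


def multiply(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def add(p, q):
    """Sum of two polynomials given as coefficient lists."""
    n = max(len(p), len(q))
    p = list(p) + [0] * (n - len(p))
    q = list(q) + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def valuation_of_quotient(num, den):
    """v(num/den) = v(num) - v(den) for nonzero polynomials num and den."""
    return order_at_zero(num) - order_at_zero(den)


p = [0, 0, 3, 1]      # 3x^2 + x^3
q = [0, 5]            # 5x
print(order_at_zero(multiply(p, q)), order_at_zero(add(p, q)), valuation_of_quotient(q, p))
```

The first two printed values illustrate v(pq) = v(p) + v(q) and v(p + q) ≥ min(v(p), v(q)); the third is negative, so q/p lies outside the valuation ring.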

Let us prove that (e) and (f) are equivalent. If (e) holds, then we can write

M = (π) for some π in R. If m + M^2 is a given element of M/M^2, then m is
of the form m = rπ for some r in R. Hence (r + M)(π + M^2) = rπ + M^2 =
m + M^2, and dim_F M/M^2 ≤ 1. Since the remarks before the proof show that
dim_F M/M^2 ≥ 1, (f) holds.
If (f) holds, let {π + M^2} be an F basis of M/M^2. If m ∈ M is given, then
m + M^2 = (r + M)(π + M^2) for some r ∈ R. Therefore m = rπ + m’ with
m’ ∈ M^2, and we see that (π) + M^2 = M. We shall apply Nakayama’s Lemma in
the local ring R/(π) with maximal ideal M/(π) and with module N = M/(π):
Given m ∈ M, we expand m = rπ + m’ with m’ ∈ M^2 as m = rπ + Σ_j m_j m’_j
with all m_j and m’_j in M. Then the equality m + (π) = Σ_j m_j m’_j + (π) in M/(π)
shows that m + (π) = Σ_j (m_j + (π))(m’_j + (π)),
hence that the coset m + (π) lies in Σ_j (m_j + (π))(M/(π)). In other words,
M/(π) = (M/(π))^2. Nakayama’s Lemma shows that M/(π) = 0, and therefore
M = (π). Thus (e) holds.

Finally let us prove that (c) and (e) are equivalent. If (c) holds, then M has to be
principal, and hence (e) holds. Suppose that (e) holds, i.e., that M = (π). Let I
be a nonzero proper ideal in R. The ideal N = ⋂_{k=1}^∞ M^k is a finitely generated R
module because R is Noetherian, and it has MN = N. By Nakayama’s Lemma,
N = 0. Since I ⊆ M and since I ≠ 0, there exists a largest integer k ≥ 1 such
that I ⊆ M^k. Choose y ≠ 0 in I with y in M^k = (π^k) but not in M^{k+1} = (π^{k+1}).
Let us write y = aπ^k for some a ∈ R. Since y is not in M^{k+1} and since R is
local, a is a unit in R. Hence a^{-1}y = π^k is in I, and therefore M^k = (π^k) ⊆ I.
Since we arranged that I ⊆ M^k, we obtain I = M^k = (π^k). Thus (c) holds.


Corollary 10.51. Let C be an irreducible quasiprojective curve over k, and 
let k(C) be its function field. If P is a point of C, then the following conditions 
are equivalent: 

(a) P is a nonsingular point,

(b) Op(C) is the valuation ring of some discrete valuation of k(C) defined 
over k, 

(c) Op(C) is integrally closed. 


PROOF. Let M_P be the unique maximal ideal of O_P(C). Zariski’s Theorem
(Theorem 10.47) shows that (a) holds if and only if dim_k M_P/M_P^2 = 1. The
corollary therefore follows from the equivalence of (f), (d), and (a) in Proposition
10.50, along with the observation that any discrete valuation produced by (d) has
to be 0 on k^×.


Corollary 10.52. If C is an irreducible affine curve over k with affine coordi- 
nate ring A(C), then the following conditions on C are equivalent: 
(a) A(C) is integrally closed, 
(b) Op(C) is integrally closed for each point P of the curve, 
(c) C is nonsingular. 


PROOF. If A(C) is integrally closed, then Corollary 8.48c of Basic Algebra
shows that each localization O_P(C) is integrally closed. Conversely if each
O_P(C) is integrally closed and if a member f of the function field k(C) is given
that is a root of a monic polynomial with coefficients in A(C), then f is a root of
the same polynomial with coefficients in O_P(C) and is in O_P(C) because O_P(C)
is integrally closed. Corollary 10.25 shows that A(C) = ⋂_P O_P(C). Therefore
f lies in A(C), and A(C) is integrally closed. This proves that (a) and (b) are
equivalent. The equivalence of (b) and (c) follows from Corollary 10.51.
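
The cuspidal cubic y^2 = x^3 shows how condition (a) can fail for a singular curve; this example is supplied here for illustration and is not taken from the text. The rational function t = y/x satisfies the monic equation T^2 − x = 0 on the curve, yet t does not belong to A(C). The sketch checks the monic relation at sample rational points obtained from the parametrization (x, y) = (s^2, s^3), which is an assumption of the sketch.

```python
from fractions import Fraction


def curve_point(s):
    """Point (s^2, s^3) on the curve y^2 = x^3."""
    return (s ** 2, s ** 3)


def t_value(x, y):
    """Value of the rational function t = y/x, defined only where x != 0."""
    if x == 0:
        raise ZeroDivisionError("t = y/x is not defined at the cusp")
    return Fraction(y) / Fraction(x)


for s in (Fraction(1, 2), Fraction(-3), Fraction(7, 5)):
    x, y = curve_point(s)
    assert y ** 2 == x ** 3          # the point lies on the curve
    assert t_value(x, y) ** 2 == x   # t is a root of T^2 - x
```

The only point where the formula y/x breaks down is the origin, which is also the one singular point of this curve, in line with the equivalence of (a) and (c).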


We turn our attention to constructing a nonsingular irreducible projective curve 
whose field of rational functions is a given function field K in one variable over 
k. If C is any irreducible quasiprojective curve with k(C) = K, then Corollary 
10.51 associates a discrete valuation of K over k to each nonsingular point of C. 
To get an idea what C must be like if it is to be nonsingular at every point, we 
now prove a theorem in the converse direction, associating a point of the curve 
to each discrete valuation of K over k. 


Theorem 10.53. Let C be an irreducible projective curve with function field
k(C) equal to K, and let v be a discrete valuation of K defined over k. If R_v is the
valuation ring of v and p_v is the valuation ideal, then there exists a unique point
P on the curve for which the maximal ideal M_P of O_P(C) has M_P ⊆ p_v.


PROOF OF UNIQUENESS. Assume the contrary. If P and Q are distinct points
with M_P ⊆ p_v and M_Q ⊆ p_v, then Proposition 10.36 constructs a function h in
k(C) with h defined at P and Q, h(P) = 0, and h(Q) ≠ 0. This function h
is in M_P, and h − h(Q) is in M_Q. The assumed inclusions of maximal ideals
imply that v(h) ≥ 1 and that v(h − h(Q)) ≥ 1. On the other hand, h(Q) ≠ 0
implies that v(h(Q)) = 0. Thus 0 = v(h(Q)) ≥ min(v(h(Q) − h), v(h)) ≥ 1,
contradiction.


PROOF OF EXISTENCE. It is shown in Problem 12 at the end of the chapter that
any projective variety in P^r is isomorphic to a projective variety V in some P^n
with n ≤ r such that V is not contained in any subvariety {[x_0, ..., x_n] | x_j = 0}
with 0 ≤ j ≤ n. That being so, we may assume that C is a projective variety
in P^n and that C ∩ β_j(A^n) ≠ ∅ for 0 ≤ j ≤ n, where β_j : A^n → P^n is the
embedding defined after Proposition 10.18. Let A(C) = k[X_0, ..., X_n]/I(C)
be the homogeneous coordinate ring of C, and for each j, let x_j be the image of
X_j in A(C). Since I(C) does not contain X_j, x_j is not the 0 element of A(C).
Since X_i and X_j are homogeneous of the same degree, each function x_i/x_j is a
well-defined member of the function field k(C).

Let N = max_{i,j} v(x_i/x_j). Possibly by renaming some coordinate x_{j_0} as x_0,
we may assume that v(x_{i_0}/x_0) = N for some i_0. Then we have v(x_i/x_0) =
v(x_{i_0}/x_0) + v(x_i/x_{i_0}) = N − v(x_{i_0}/x_i) ≥ 0 for all i. Consequently each function
x_i/x_0 lies in the subring R_v of k(C).
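
The renaming step can be pictured with integer data. Since v(x_i/x_j) = v(x_i/x_0) − v(x_j/x_0), all the pairwise values are differences w_i − w_j of the numbers w_i = v(x_i/x_0) computed with the original coordinate x_0. The maximum N is then attained with the denominator index that minimizes w_j, and dividing by that coordinate makes every valuation nonnegative. The sample values below are hypothetical, as are the function names.

```python
def best_denominator(weights):
    """Index j0 for which v(x_i / x_j0) = weights[i] - weights[j0] >= 0 for all i."""
    return min(range(len(weights)), key=lambda j: weights[j])


def ratio_valuations(weights, j0):
    """List of v(x_i / x_j0) for all i under the given weights."""
    return [w - weights[j0] for w in weights]


weights = [2, -1, 0, 5]          # hypothetical values w_0, ..., w_3
j0 = best_denominator(weights)
N = max(wi - wj for wi in weights for wj in weights)
print(j0, ratio_valuations(weights, j0), N)
```

The largest entry of the list returned by ratio_valuations equals N, and no entry is negative, which is the inequality v(x_i/x_0) ≥ 0 after renaming.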

Theorem 10.20 and Corollary 10.22 show that C_0 = β_0^{-1}(C) is an irre-
ducible affine curve and that its prime ideal is I(C_0) = β_0^t(I(C)). Conse-
quently the substitution homomorphism β_0^t : k[X_0, ..., X_n] → k[X_1, ..., X_n],
which sets X_0 equal to 1,
descends to a homomorphism of A(C) = k[X_0, ..., X_n]/I(C) onto A(C_0) =
k[X_1, ..., X_n]/I(C_0) that carries x_0 in A(C) to 1 and carries the members
x_1, ..., x_n of A(C) to the generators of A(C_0). The members x_i/x_0 of k(C)
therefore get identified with the generators of A(C_0), and we conclude that
A(C_0) ⊆ R_v.

Define q = p_v ∩ A(C_0). This is a prime ideal of A(C_0), and it pulls back
under the quotient homomorphism k[X_1, ..., X_n] → A(C_0) to a prime ideal q̃
containing I(C_0). Then V(q̃) is an affine subvariety of C_0. Since dim C_0 = 1,
there are only two possibilities. One is that dim V(q̃) = 1, in which case V(q̃) =
C_0, q̃ = I(C_0), and q = 0. The other is that dim V(q̃) = 0, in which case
V(q̃) = {P} for some point P that necessarily lies on C_0. In the first case, v
is 0 on every nonzero member of A(C_0) and hence is 0 on k(C)^×, contradiction.
Thus we are in the second case. Then q̃ is maximal in k[X_1, ..., X_n], q is
maximal in A(C_0), q is the ideal m_P of all members of A(C_0) vanishing at P,
and A(C_0)/q = k. If S denotes the set-theoretic complement of q in A(C_0), then
no member of S can be in p_v because then q + k1 = A(C_0) would be in p_v,
contradiction. Thus v(s) = 0 for all s ∈ S, and M_P = S^{-1}m_P ⊆ p_v.


Corollary 10.54. If φ is a rational map from an irreducible curve C’ to an
irreducible projective curve C, then the largest domain on which φ is a morphism
contains every nonsingular point of C’. If C’ is nonsingular, then φ is a morphism
from C’ into C.


PROOF. If φ is not dominant, then Problem 6 at the end of the chapter shows that
φ is constant. Certainly the largest domain on which a constant φ is a morphism
is C’.

Thus suppose that $\varphi$ is dominant. Using the notation introduced early in
Section 6, let $\tilde\varphi : k(C) \to k(C')$ be the associated field map of function fields.
Since $k(C)$ and $k(C')$ both have transcendence degree 1 over $k$ and since $k(C')$ is
finitely generated as a field over $k$, the field $k(C')$ is a finite algebraic extension
of the field $\tilde\varphi(k(C))$. If $v$ is any discrete valuation of $k(C')$, then it follows from
the finiteness of this extension that $v$ cannot be identically 0 on $\tilde\varphi(k(C))^\times$; in fact,
if it were identically 0, then the expansion $x = \sum_{j=1}^m c_j x_j$ of a general element
$x$ of $k(C')$ in terms of a vector-space basis $\{x_1,\dots,x_m\}$ of $k(C')$ over $\tilde\varphi(k(C))$
would yield the inequality $v(x) \ge \min_j v(x_j)$, which cannot be true for all $x$.
Meanwhile, if $P$ is a nonsingular point of $C'$, then Corollary 10.51 shows that
$\mathcal{O}_P(C')$ is the valuation ring $R_v$ for some valuation $v$ of $k(C')$ over $k$. The maximal
ideal $M_P$ of $\mathcal{O}_P(C')$ equals the valuation ideal $\mathfrak{p}_v$ of $v$. Since the restriction of $v$ to
$\tilde\varphi(k(C))^\times$ is not identically 0, the restriction comes from some positive multiple $e$
of a discrete valuation on $\tilde\varphi(k(C))$. Let $v_0$ be the corresponding discrete valuation
of $k(C)$; this is given by $v_0(f) = e^{-1}v(\tilde\varphi(f))$. Let $R_0$ be its valuation ring and
$\mathfrak{p}_0$ be its valuation ideal in $k(C)$; the latter is given by $\mathfrak{p}_0 = \tilde\varphi^{-1}(\mathfrak{p}_v)$. Theorem
10.53 shows that there exists a unique point $Q$ on the curve $C$ such that the
maximal ideal $M_Q$ of $\mathcal{O}_Q(C)$ is contained in $\mathfrak{p}_0$. That is,
$M_Q \subseteq \mathfrak{p}_0 = \tilde\varphi^{-1}(\mathfrak{p}_v)$.
Application of $\tilde\varphi$ gives $\tilde\varphi(M_Q) \subseteq \tilde\varphi\tilde\varphi^{-1}(\mathfrak{p}_v) \subseteq \mathfrak{p}_v = M_P$. Theorem 10.45 shows
that consequently $P$ is in the largest domain on which $\varphi$ is a morphism and that
$\varphi(P) = Q$.
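
The relation between a valuation and its restriction under the field map can be seen in a toy case. The following is a minimal sketch, not taken from the text: it assumes the hypothetical map $t \mapsto t^2$ of the affine line to itself, so that the field map sends $f(x)$ to $f(t^2)$, and it takes $v$ to be the order of vanishing at $t = 0$. Polynomials are coefficient lists, and all function names are illustrative.

```python
def order_at_zero(coeffs):
    """Order of vanishing at 0 of a nonzero polynomial [a0, a1, a2, ...]."""
    for i, a in enumerate(coeffs):
        if a != 0:
            return i
    raise ValueError("the zero polynomial has no finite order")


def pull_back(coeffs):
    """Field map induced by t -> t**2: send f(x) to f(t**2)."""
    out = [0] * (2 * len(coeffs) - 1)
    for i, a in enumerate(coeffs):
        out[2 * i] = a
    return out


def v0(coeffs, e=2):
    """Valuation downstairs, recovered as v0(f) = v(pull_back(f)) / e."""
    value, remainder = divmod(order_at_zero(pull_back(coeffs)), e)
    assert remainder == 0  # v restricted to the image takes values in e*Z
    return value


f = [0, 0, 3, 1]  # f(x) = 3x^2 + x^3
print(order_at_zero(pull_back(f)), v0(f))
```

In this sketch the multiple $e$ equals 2: the restriction of $v$ to the image of the field map is twice the order of vanishing at $x = 0$.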


Corollary 10.55. If two nonsingular irreducible projective curves are bira- 
tionally equivalent, then they are isomorphic as varieties. 


PROOF. This follows by applying Corollary 10.54 twice. 


Corollary 10.56. If C is a nonsingular irreducible projective curve with 
function field K = k(C), then the points of C are in one-one correspondence 
with the discrete valuations of K defined over k. 


PROOF. This is the correspondence given in one direction by Corollary 10.51 
and in the reverse direction by Theorem 10.53. 
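
For the projective line, whose function field is $k(x)$, the correspondence can be made concrete: each finite point $a$ gives the valuation "order of vanishing at $a$", and the point at infinity gives the valuation $\deg(\text{denominator}) - \deg(\text{numerator})$. The sketch below is an illustration under simplifying assumptions (integer coefficients, integer points, leading coefficients nonzero); the helper names are hypothetical.

```python
def evaluate(coeffs, a):
    """Evaluate the polynomial [a0, a1, ...] at a by Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * a + c
    return result


def divide_by_linear(coeffs, a):
    """Quotient of the polynomial by (x - a), assuming a is a root."""
    high_to_low = list(reversed(coeffs))
    quotient = [high_to_low[0]]
    for c in high_to_low[1:-1]:
        quotient.append(c + a * quotient[-1])
    return list(reversed(quotient))


def order_at(coeffs, a):
    """Multiplicity of a as a root of a nonzero polynomial."""
    order = 0
    while len(coeffs) > 1 and evaluate(coeffs, a) == 0:
        coeffs = divide_by_linear(coeffs, a)
        order += 1
    return order


def valuation_at_point(num, den, a):
    """Valuation of num/den attached to the finite point a."""
    return order_at(num, a) - order_at(den, a)


def valuation_at_infinity(num, den):
    """Valuation of num/den attached to the point at infinity."""
    return (len(den) - 1) - (len(num) - 1)


num = [0, 0, -1, 1]    # x^2 (x - 1)
den = [-8, 12, -6, 1]  # (x - 2)^3
values = [valuation_at_point(num, den, a) for a in (0, 1, 2)]
print(values, valuation_at_infinity(num, den))
```

For this rational function the valuations at the points 0, 1, 2 and at infinity sum to zero, as one expects for the zeros and poles of a rational function on the projective line.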


Corollary 10.56 has a remarkable conclusion, but the corollary assumes the
existence of a nonsingular projective curve, which we have not yet proved. In more
detail we now know that a nonsingular point $P$ of any irreducible projective curve
$C$ picks out a unique discrete valuation $v$ of the function field $K = k(C)$, namely
the one whose valuation ring is given by $R_v = \mathcal{O}_P(C)$, and that conversely when
$C$ is projective, any discrete valuation $v'$ defined over $k$ picks out a certain point $P'$
of $C$ with the property that $\mathcal{O}_{P'}(C) \subseteq R_{v'}$. If $P$ is nonsingular and we go through
the first step and then the second, using $v' = v$, we obtain $\mathcal{O}_{P'}(C) \subseteq \mathcal{O}_P(C)$.
Proposition 10.36 shows that $P' = P$, and hence the second process inverts the
first. That is what Corollary 10.56 says. Also, we know from Theorem 10.47 that
many discrete valuations are involved in this process, since the set of nonsingular
points of a variety is Zariski open. What we do not know is that any given discrete
valuation over $k$ ever yields a nonsingular point for any curve with the function
field $K$. This missing piece of information will be supplied in Corollary 10.58
below. To prove Corollary 10.58, we shall make use of the following theorem,
which we need only in the case that the field in its statement is our algebraically
closed field $k$. We postpone the proof of the theorem for a moment, and when we give
the proof, we shall give it only for the case that the field in the statement is
algebraically closed.


Theorem 10.57. Let $k$ be a field, let $R = k[x_1,\dots,x_n]$ be a finitely generated
integral domain over $k$, let $K$ be the field of fractions of $R$, and let $L$ be a finite
algebraic extension of $K$. Then the integral closure $T$ of $R$ in $L$ is a finitely
generated $R$ module.
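
A standard worked example, not taken from the text, may help fix ideas. It assumes $L = K$ and takes $R$ to be the coordinate ring of the cuspidal cubic; the parametrization by $t$ is the usual one, and the symbols $t$, $x$, $y$ here are local to the example.

```latex
R = k[x,y]/(y^2 - x^3) \;\cong\; k[t^2, t^3] \subseteq k[t],
\qquad x \mapsto t^2,\quad y \mapsto t^3,
\\[4pt]
K = L = k(t), \qquad t = y/x \ \text{ satisfies the monic equation } \ t^2 - x = 0 \ \text{ over } R,
\\[4pt]
T = k[t] = R\cdot 1 + R\cdot t .
```

Here $k[t]$ is integrally closed because it is a unique factorization domain, and every power $t^n$ with $n \ge 2$ already lies in $R$, so the two elements $1$ and $t$ generate $T$ as an $R$ module.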


Corollary 10.58. Let $C$ be an irreducible projective curve with function field
$K = k(C)$, let $P$ be a point of $C$, and let $M_P$ be the maximal ideal of $\mathcal{O}_P(C)$.
Then there exists a discrete valuation $v$ of $K$ defined over $k$ whose valuation ideal
$\mathfrak{p}_v$ has $M_P \subseteq \mathfrak{p}_v$.


REMARKS. This result is a supplement to Theorem 10.53. It says that the map 
of that theorem, carrying discrete valuations of K defined over k to points of C, 
is onto. 


PROOF. Without loss of generality, we may assume that $C$ is affine. Let $\mathfrak{m}_P$ be
the maximal ideal in the affine coordinate ring $A(C)$ consisting of all functions
vanishing at $P$, and let $S$ be the set-theoretic complement of $\mathfrak{m}_P$ in $A(C)$, so
that $M_P = S^{-1}\mathfrak{m}_P$. Evaluation at $P$ is a linear functional on $A(C)$ with kernel
$\mathfrak{m}_P$, and therefore $A(C) = \mathfrak{m}_P + k1$. In other words, $\mathfrak{m}_P$ and any element of $S$
together generate $A(C)$ as a $k$ vector space.

If $T$ denotes the integral closure of $A(C)$ in $K$, then Theorem 10.57 implies that
$T$ is Noetherian, and Proposition 8.45 of Basic Algebra shows that every nonzero
prime ideal of $T$ is maximal. Hence $T$ is a Dedekind domain. Proposition
8.53 of Basic Algebra shows that there exists a maximal ideal $\mathfrak{q}$ of $T$ such that
$\mathfrak{m}_P = A(C) \cap \mathfrak{q}$. Since $T$ is a Dedekind domain, $\mathfrak{q}$ is contained in the valuation
ideal $\mathfrak{p}_v$ of a unique discrete valuation $v$ of $K$, and $T$ is contained in the valuation
ring $R_v$ of $v$. Thus $\mathfrak{m}_P \subseteq \mathfrak{p}_v$, and $S \subseteq T$ implies that $v(s) \ge 0$ for all $s \in S$. On the
other hand, 1 lies in $\mathfrak{m}_P + ks$ for any $s$ in $S$, and hence $0 = v(1) \ge \min(1, v(s))$.
Therefore $v(s) = 0$ for all $s \in S$, and $M_P = S^{-1}\mathfrak{m}_P \subseteq \mathfrak{p}_v$.


Corollary 10.59. If $K$ is a function field in one variable over $k$ and if $v$ is a
discrete valuation of $K$ defined over $k$ with valuation ring $R_v$, then there exists
an irreducible nonsingular affine curve $C$ over $k$ with function field $K$ and with a
point $P$ such that $\mathcal{O}_P(C) = R_v$.


PROOF. Choose an element $x$ of $K$ such that $v(x) > 0$. Define $R = k[x]$.
Since $v(x) \ne 0$, $x$ is transcendental over $k$, and $K$ is a finite algebraic extension
of the field of fractions $k(x)$ of $R$. Corollary 7.14 shows that the integral closure
$T$ of $R$ in $K$ is a Dedekind domain, and Theorem 10.57 shows that $T$ is a finitely
generated $R$ module. Thus we can write $T$ as $T = k[x_1,\dots,x_n]$ with $x_1 = x$.
The substitution homomorphism with $X_j \mapsto x_j$ for all $j$ carries $k[X_1,\dots,X_n]$
onto $T$ and has a prime ideal $\mathfrak{p}$ as kernel, since $T$ is an integral domain. Thus
$V(\mathfrak{p})$ is an affine variety with $T$ as its affine coordinate ring. The dimension of
$V(\mathfrak{p})$ is the transcendence degree of $K$ over $k$, which is 1 by assumption. Thus
$C = V(\mathfrak{p})$ is an irreducible curve. Since $T$ is integrally closed by construction,
Corollary 10.52 shows that $C$ is nonsingular.

Let $R_v \subseteq K$ be the valuation ring of $v$, and let $\mathfrak{p}_v$ be the valuation ideal. The
inequality $v(x) > 0$ shows that $v$ is $\ge 0$ on $R = k[x]$, and Proposition 6.7 says
that $v$ is consequently $\ge 0$ on the integral closure $T$ of $R$ in $K$. In other words, $T$
is contained in $R_v$. Since $T$ is a Dedekind domain and $K$ is its field of fractions,
Theorem 6.5 shows that $\mathfrak{q} = \mathfrak{p}_v \cap T$ is a nonzero prime (= maximal) ideal of $T$
and that the discrete valuation $v_{\mathfrak{q}}$ of $K$ over $k$ determined by $\mathfrak{q}$ coincides with $v$.
The maximal ideals of the affine coordinate ring of an affine variety correspond
to the points of the variety by Proposition 10.23, and thus there exists a point $P$
of $C$ such that $\mathfrak{q}$ is the maximal ideal of $T$ consisting of all functions vanishing
at $P$. The localization of $T$ with respect to $\mathfrak{q}$ is $\mathcal{O}_P(C)$ by definition and is $R_v$ by
Proposition 6.4. Therefore $\mathcal{O}_P(C) = R_v$.


Corollary 10.60. Let $C$ be the irreducible nonsingular affine curve constructed
in Corollary 10.59 and having function field $K = k(C)$, and regard $C$ as a
subvariety of its projective closure $\overline{C}$. Then there are only finitely many discrete
valuations $v'$ of $K$ defined over $k$ such that the unique point $P$ of $\overline{C}$ with
$M_P \subseteq \mathfrak{p}_{v'}$, where $M_P$ is the maximal ideal of $\mathcal{O}_P(\overline{C})$ and $\mathfrak{p}_{v'}$ is the
valuation ideal of $v'$, lies outside $C$.


PROOF. We go over the argument in Corollary 10.59 with the same element
$x$ and with any discrete valuation $v'$ defined over $k$ such that $v'(x) \ge 0$. This
inequality implies that $v'$ is $\ge 0$ on $k[x]$, and Proposition 6.7 then shows that $v'$ is
$\ge 0$ on $T = A(C)$. Thus $A(C)$ is contained in the valuation ring $R_{v'}$ of $v'$. Define
$\mathfrak{q} = \mathfrak{p}_{v'} \cap A(C)$. Arguing as in the existence proof for Theorem 10.53, we find
that $\mathfrak{q}$ equals the ideal $\mathfrak{m}_P$ of all members of $A(C)$ vanishing at a certain point
$P$ of $C$, and that proof then shows that $M_P \subseteq \mathfrak{p}_{v'}$. By uniqueness in Theorem
10.53, this $P$ is the one and only point produced by that theorem.

In other words, the only discrete valuations v’ of K defined over k for which 
the point P lies outside C are those with v’(x) < 0. Corollary 6.10 shows that 
there are only finitely many of these.

We come to the proof of Theorem 10.57, but only under the assumption that k is algebraically
closed. The proof is rather technical, and the reader is encouraged to skip it on first reading. To 
underscore this point, the proof appears in small print. We need two lemmas. 


Lemma 10.61. Let $R$ be a Noetherian integrally closed domain with field of fractions $F$, let $K$ be
a finite separable extension of $F$, and let $T$ be the integral closure of $R$ in $K$. Then $T$ is Noetherian
and is finitely generated as an $R$ module.


PROOF. In effect, this result was proved in Basic Algebra. In more detail: With the above
assumptions and also the assumption that every nonzero prime ideal of $R$ is maximal (i.e., that $R$
is a Dedekind domain), the proof of Theorem 8.54 of Basic Algebra showed that $T$ is a Dedekind
domain. The hard part of that proof appeared in Section IX.15; it showed from the separability that
$T$ is finitely generated as an $R$ module, and it did not make use of the assumption that every nonzero
prime ideal of $R$ is maximal. Since $T$ is finitely generated and $R$ is Noetherian, every $R$ submodule
of $T$ is a finitely generated $R$ module, by Proposition 8.34 of Basic Algebra. In particular, every
ideal of $T$ is finitely generated as an $R$ module and therefore is finitely generated as a $T$ module.
Consequently $T$ is Noetherian.


Lemma 10.62 (Noether Normalization Lemma). Let $k$ be an infinite field, let $R = k[x_1,\dots,x_n]$
be a finitely generated integral domain over $k$, and let $K = k(x_1,\dots,x_n)$ be the field of fractions of
$R$. Then for a suitable $d$ with $0 \le d \le n$, there exist $d$ linear combinations $y_1,\dots,y_d$ of $x_1,\dots,x_n$
with coefficients in $k$ such that $y_1,\dots,y_d$ are algebraically independent over $k$ and such that every
element of $R$ is integral over $k[y_1,\dots,y_d]$. If $K$ is separably generated over $k$, then the $y_j$ may be
chosen in such a way that $K$ is a separable extension of $k(y_1,\dots,y_d)$.


REMARKS. It is immediate from the conclusion that d is the transcendence degree of K over k. 
The lemma is a result about the extension of rings that improves upon Theorem 7.7 for fields; the 
latter says that every field extension can be accomplished by a transcendental extension followed by 
an algebraic extension. The present lemma says that the passage from a field to a finitely generated 
integral domain can be accomplished by a full polynomial extension followed by an extension in 
which each generator is not merely algebraic but actually is a root of a monic polynomial with 
coefficients in the full polynomial ring. 


PROOF. Let $I$ be the kernel of the quotient homomorphism $k[X_1,\dots,X_n] \to k[x_1,\dots,x_n]$.
The core of the proof involves a single nonzero $f$ in $I$. The idea is to replace $X_1,\dots,X_{n-1}$ by new
indeterminates $X'_1,\dots,X'_{n-1}$ to make the equation $f(x_1,\dots,x_n) = 0$ become a monic polynomial
equation satisfied by $x_n$ over $R' = k[x'_1,\dots,x'_{n-1}]$. With $c_1,\dots,c_{n-1}$ equal to members of $k$ to be
specified later, define $x'_j = x_j - c_j x_n$ for $1 \le j \le n-1$. The equation $f(x_1,\dots,x_n) = 0$ becomes

    $f(x'_1 + c_1 x_n, \dots, x'_{n-1} + c_{n-1} x_n, x_n) = 0.$    (∗)

For a suitable choice of $c_1,\dots,c_{n-1}$, we shall show in a moment that

    the polynomial $f(X'_1 + c_1 X_n, \dots, X'_{n-1} + c_{n-1} X_n, X_n)$ is monic in $X_n$    (∗∗)

after multiplication by a member of $k^\times$.

Assuming (∗∗), let us see how the first conclusion of the lemma follows by induction on $n$. For
$n = 1$, there are two cases. One case is that $K$ is a simple algebraic extension field of $k$, and then
every element of the extension field $R = K$ is a root of its minimal polynomial over $k$. This is the
case $d = 0$. The other case is that $K$ is a simple transcendental extension, and then we can take
$y_1 = x_1$. This is the case $d = 1$.

For the inductive step, assume the first conclusion of the lemma for $n - 1 \ge 1$, $d$ being an integer
with $0 \le d \le n-1$. If $I = 0$, there is nothing to prove, since $x_1,\dots,x_n$ are then algebraically
independent and the lemma follows with $d = n$ and with $y_j = x_j$ for $1 \le j \le n$. If $I \ne 0$, fix $f \ne 0$
in $I$, and choose $c_1,\dots,c_{n-1}$ in $k$ to make (∗∗) hold. Then (∗) shows that $x_n$ is a root of a monic
polynomial with coefficients in $R' = k[x'_1,\dots,x'_{n-1}]$. By the inductive hypothesis we can choose
members $y'_1,\dots,y'_d$ of $R'$ with $0 \le d \le n-1$ such that $y'_1,\dots,y'_d$ are algebraically independent
over $k$ and such that every element of $R'$ is integral over $k[y'_1,\dots,y'_d]$. By transitivity of integral
dependence, every element of $R'[x_n]$ is integral over $k[y'_1,\dots,y'_d]$. Since the definition of $x'_j$ in
terms of $x_j$ shows that $R'[x_n] = k[x'_1,\dots,x'_{n-1},x_n] = k[x_1,\dots,x_{n-1},x_n] = R$, every element of
$R$ is integral over $k[y'_1,\dots,y'_d]$. This completes the induction, and the first sentence of conclusions
of the lemma is proved except for (∗∗).

To prove (∗∗), let $r = \deg f$, and write $f = h_r + g$ with $h_r$ nonzero and homogeneous of degree
$r$ and with $\deg g \le r - 1$ (or $g = 0$). Then

    $f(X_1,\dots,X_n) = f(X'_1 + c_1 X_n, \dots, X'_{n-1} + c_{n-1} X_n, X_n)$
    $\quad = h_r(c_1 X_n, \dots, c_{n-1} X_n, X_n) + (\text{terms involving } 1, X_n, X_n^2, \dots, X_n^{r-1})$
    $\quad = h_r(c_1, \dots, c_{n-1}, 1)\,X_n^r + (\text{terms involving } 1, X_n, X_n^2, \dots, X_n^{r-1}).$

Thus (∗∗) is proved if $c_1,\dots,c_{n-1}$ can be chosen with the scalar $h_r(c_1,\dots,c_{n-1},1)$ not 0. Here the
fact that $h_r$ is nonzero and homogeneous implies that $h_r(X_1,\dots,X_{n-1},1)$ is not the 0 polynomial
in $k[X_1,\dots,X_{n-1}]$. Since $k$ is an infinite field, Corollary 4.32 of Basic Algebra shows that the
evaluation mapping of $k[X_1,\dots,X_{n-1}]$ into the algebra of functions from $k^{n-1}$ into $k$ is one-one,
and therefore there exist $c_1,\dots,c_{n-1}$ with $h_r(c_1,\dots,c_{n-1},1) \ne 0$. This proves (∗∗).
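
The change of variables in this step can be tried out in the smallest case $n = 2$. The sketch below is an illustration only: it assumes integer coefficients, stores a polynomial in two variables as a dictionary from exponent pairs to coefficients, and uses the hyperbola $f = X_1X_2 - 1$ as a hypothetical example. The names `substitute` and `top_form_at` are not from the text.

```python
from math import comb


def substitute(f, c):
    """Return f(X1p + c*X2, X2) as {(i, j): coeff} in the variables X1p, X2."""
    g = {}
    for (i, j), a in f.items():
        # Expand (X1p + c*X2)**i by the binomial theorem.
        for s in range(i + 1):
            key = (s, j + i - s)
            g[key] = g.get(key, 0) + a * comb(i, s) * c ** (i - s)
    return {key: a for key, a in g.items() if a != 0}


def top_form_at(f, c):
    """Value h_r(c, 1), where h_r is the homogeneous part of f of top degree r."""
    r = max(i + j for (i, j) in f)
    return sum(a * c ** i for (i, j), a in f.items() if i + j == r)


f = {(1, 1): 1, (0, 0): -1}  # f = X1*X2 - 1, not monic in X2
g = substitute(f, 1)         # (X1p + X2)*X2 - 1 = X2^2 + X1p*X2 - 1
print(g, top_form_at(f, 1), top_form_at(f, 0))
```

With $c = 1$ the coefficient of $X_2^2$ is $h_2(1,1) = 1$, so the substituted polynomial is monic in $X_2$; with $c = 0$ the scalar $h_2(0,1)$ vanishes, which is exactly the choice the argument excludes.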

We are left with proving that if $K$ is separably generated over $k$, then the $y_j$ may be chosen with
$K$ separable over $k(y_1,\dots,y_d)$. We proceed as above but with an amended version of (∗∗) that we
mention in a moment. In the induction the extra hypothesis for $n = 1$ is that either $x_1$ is separable
algebraic over $k$ or $x_1$ is transcendental, and in both cases $K$ is a separable extension of $k(y_1,\dots,y_d)$.
For the inductive step when $I \ne 0$, Theorem 7.18 shows that $\{x_1,\dots,x_n\}$ contains a separating
transcendence basis; possibly by renumbering the variables, we may assume that this transcendence
basis is a subset of $\{x_1,\dots,x_{n-1}\}$. In particular, $x_n$ is separable algebraic over $k(x_1,\dots,x_{n-1})$. For
the polynomial $f$, we start from the minimal polynomial of $x_n$ over $k(x_1,\dots,x_{n-1})$, next multiply
by a common denominator to get all coefficients of powers of $X_n$ to be in $k[x_1,\dots,x_{n-1}]$, and then
replace the occurrences of $x_1,\dots,x_{n-1}$ by $X_1,\dots,X_{n-1}$. The result is $f$. We choose $y'_1,\dots,y'_d$
as above, and the inductive hypothesis shows that $k(x'_1,\dots,x'_{n-1})$ is separable over $k(y'_1,\dots,y'_d)$.
If we can show that $x_n$ is separable over $k(x'_1,\dots,x'_{n-1})$, then we will have proved that $K$ is a
separable extension of $k(y'_1,\dots,y'_d)$ because of the transitivity of separability. So the induction will
be complete.

To get that $x_n$ is separable over $k(x'_1,\dots,x'_{n-1})$, it is enough to prove that we can arrange for

    $x_n$ to be a simple root of $f(x'_1 + c_1 X_n, \dots, x'_{n-1} + c_{n-1} X_n, X_n)$    (†)

in addition to (∗∗). Indeed, then $x_n$ is a root of a separable polynomial over $k(x'_1,\dots,x'_{n-1})$ and
hence is a separable element over $k(x'_1,\dots,x'_{n-1})$. The condition (†) is the same as the condition
that the derivative of the polynomial in (†) with respect to $X_n$, when evaluated at $x_n$, be nonzero.
Thus we want to arrange that

    $f_n(x_1,\dots,x_{n-1},x_n) + c_1 f_1(x_1,\dots,x_{n-1},x_n) + \cdots + c_{n-1} f_{n-1}(x_1,\dots,x_{n-1},x_n) \ne 0,$    (††)

where the subscripts on $f$ indicate first partial derivatives in the indicated variables. The left side
of (††) is the sum of a constant and a linear functional on the vector space of all $(c_1,\dots,c_{n-1})$ in
$k^{n-1}$. The constant term is $f_n(x_1,\dots,x_{n-1},x_n)$, which is nonzero because $x_n$ is separable over
$k(x_1,\dots,x_{n-1})$ and is therefore a simple root of its minimal polynomial over $k(x_1,\dots,x_{n-1})$. Thus
the left side of (††) is the value of a nonzero polynomial $p(X_1,\dots,X_{n-1}) = a_n + \sum_{j=1}^{n-1} a_j X_j$
at $(c_1,\dots,c_{n-1})$. Consequently (∗∗) and (††) will hold simultaneously if we choose a point
$(c_1,\dots,c_{n-1})$ in $k^{n-1}$ at which the nonzero polynomial $p(X_1,\dots,X_{n-1})\,h_r(X_1,\dots,X_{n-1},1)$ is
not zero.


PROOF OF THEOREM 10.57 UNDER THE ASSUMPTION THAT k IS ALGEBRAICALLY CLOSED.
The first step is to reduce to the case that $L = K$, i.e., that the field of fractions of $R$ coincides with
$L$. To do so, choose a vector-space basis $\{z_1,\dots,z_r\}$ of $L$ over $K$ consisting of elements integral
over $R$; this is possible by Proposition 8.42 of Basic Algebra. Put $S = R[z_1,\dots,z_r]$. This is a
finitely generated integral domain over $k$, all of its elements are integral over $R$, and it has $L$ as field
of fractions. The integral closure of $R$ in $L$ equals the integral closure of $S$ in $L$.

Thus we may assume that $R = k[x_1,\dots,x_n]$ is an integral domain with field of fractions $K$ and
that we are to prove that the integral closure $T$ of $R$ in $K$ is a finitely generated $R$ module. Let $d$ be
the transcendence degree of $K$ over $k$. Since algebraically closed fields are perfect, Theorem 7.20
shows that $K$ is separably generated over $k$. Lemma 10.62 is therefore applicable, and it produces
$d$ linear combinations $y_1,\dots,y_d$ of $x_1,\dots,x_n$ over $k$ such that the subring $S = k[y_1,\dots,y_d]$ of
$R$ is a full polynomial ring, every element of $R$ is integral over $S$, and $K$ is a separable extension
of the field $k(y_1,\dots,y_d)$. Since every element of $T$ is integral over $R$, the transitivity of integral
dependence implies that every element of $T$ is integral over $S$. Therefore $T$ is the integral closure
of $S$ in $K$. Being a full polynomial ring, $S$ is Noetherian and is a unique factorization domain; the
latter property implies that $S$ is integrally closed, according to Proposition 8.41 of Basic Algebra.
Taking $S$ to be the Noetherian integrally closed domain in Lemma 10.61, we see that $T$ is finitely
generated as an $S$ module. Since $S \subseteq R$, $T$ is certainly finitely generated as an $R$ module.


Now we come to the main theorem of this section. 


Theorem 10.63. Every birational equivalence class of irreducible projective 
curves contains a nonsingular such curve, and this curve is unique within the 
equivalence class up to isomorphism of varieties. Any irreducible nonsingular 
quasiprojective curve is isomorphic to an open subvariety of some irreducible 
nonsingular projective curve. 


REMARKS. The new content of the theorem is the existence of the nonsingular
projective curve. The uniqueness is immediate from Corollary 10.55. The
statement about nonsingular quasiprojective curves is a formality: Such a curve
$C_0$ is birational to the nonsingular projective curve $C$ produced by the theorem
and also to the projective closure $\overline{C_0}$ of $C_0$. The birational maps from $C_0$ into
$C$ and from $C$ into $\overline{C_0}$ yield morphisms from $C_0$ into $C$ and from $C$ into $\overline{C_0}$ by
Corollary 10.54; sorting out these morphisms shows that $C_0$ is isomorphic to an
open subvariety of $C$.


The idea for proving the existence of the projective curve in the theorem is to 
start with any function field K in one variable over k, take any discrete valuation 
v of K defined over k (these exist as a consequence of Section VI.2), and use 
Corollary 10.59 to obtain some irreducible nonsingular affine curve having K as 
function field and having its local ring at some point equal to the valuation ring of v. 
Corollary 10.60 shows that except for finitely many discrete valuations, we have 
associated a nonsingular point on some irreducible affine curve in the birational 
equivalence class to each discrete valuation of K defined over k. Applying 
Corollary 10.59 to each of these exceptional discrete valuations, we end up with a 
finite set of irreducible nonsingular affine curves such that each discrete valuation
of K over k corresponds to some point of at least one of the curves. We shall
glue together these irreducible nonsingular affine curves in a suitable fashion to 
obtain the desired irreducible nonsingular projective curve. 

The proof makes use of the fact that the product of two projective varieties 
is a projective variety and that morphisms behave as one might expect. Let us 
postpone the details of establishing a rigorous theory of product varieties, going 
right to the proof of Theorem 10.63. 


PROOF OF THEOREM 10.63. Let $K$ be the given function field, and let $C_1,\dots,C_m$
be the irreducible nonsingular affine curves described two paragraphs before this
paragraph. In each case the function field of the curve is isomorphic to $K$ by
some fixed isomorphism, but we shall treat this fixed isomorphism as if it were
the identity in order to avoid unnecessary complications in the notation. Let $\mathcal{V}_K$
be the set of discrete valuations of $K$ defined over $k$. For $v \in \mathcal{V}_K$, we write
$R_v \subseteq K$ for the valuation ring of $v$ and $\mathfrak{p}_v$ for the valuation ideal of $v$.

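For definiteness let $C_j$ be an affine variety in $\mathbb{A}^{k_j}$, and let $\overline{C_1},\dots,\overline{C_m}$ be the
respective projective closures of $C_1,\dots,C_m$ in $\mathbb{P}^{k_j}$. For any point $P$ in $\overline{C_j}$, let
$M_P$ be the maximal ideal of the local ring $\mathcal{O}_P(\overline{C_j})$.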

Theorem 10.53 gives us for each $j$ a well-defined function $\gamma_j : \mathcal{V}_K \to \overline{C_j}$, and
Corollary 10.58 says that $\gamma_j$ is onto $\overline{C_j}$. The defining property of $\gamma_j(v)$ is that
$M_{\gamma_j(v)} \subseteq \mathfrak{p}_v$, and it follows that $\mathcal{O}_{\gamma_j(v)}(\overline{C_j}) \subseteq R_v$. Corollary 10.51 shows that the
inverse image under $\gamma_j$ of any point in $C_j$ is a singleton set, and Corollary 10.60
shows that the inverse image of any point of the complementary set $\overline{C_j} - C_j$
is a finite set. Let $F$ be the finite subset $F = \bigcup_{j=1}^m \gamma_j^{-1}(\overline{C_j} - C_j)$ of $\mathcal{V}_K$.
For $v \notin F$, $\gamma_j(v)$ is a nonsingular point of $C_j$, and Corollary 10.51 shows that
$\mathcal{O}_{\gamma_j(v)}(C_j) = R_v$. Hence also $M_{\gamma_j(v)} = \mathfrak{p}_v$. The construction of the curves
$C_1,\dots,C_m$ was arranged in such a way that

    each $v \in \mathcal{V}_K$ has $\gamma_j(v)$ in $C_j$ for some $j$.    (∗)


Let $U_j$ be the open set of $C_j$ given by $U_j = \gamma_j(\mathcal{V}_K - F)$. The curves $C_j$ are
birationally equivalent because they all have $K$ as function field, and Corollary
10.54 shows that the largest domain on which the birational map from $C_j$ to $\overline{C_1}$
is a morphism includes all the nonsingular points of $C_j$. In particular, it contains
$U_j = \gamma_j(\mathcal{V}_K - F)$. If $\varphi_j$ is the morphism from $U_j$ into $\overline{C_1}$, then Proposition
10.42 shows that $\varphi_j$ induces a homomorphism $\varphi^*_{j,P} : \mathcal{O}_{\varphi_j(P)}(\overline{C_1}) \to \mathcal{O}_P(C_j)$ for
$P \in U_j$. By assumption, the isomorphism $\tilde\varphi_j : k(\overline{C_1}) \to k(C_j)$ is normalized to
be the identity. Since $\tilde\varphi_j$ is the field mapping corresponding to the birational map
$\varphi_j$, $\tilde\varphi_j$ is an extension of $\varphi^*_{j,P}$. Thus $\varphi^*_{j,P}$ is the identity under our identifications:
$\mathcal{O}_{\varphi_j(P)}(\overline{C_1}) = \mathcal{O}_P(C_j)$ for $P \in U_j$. Let $P = \gamma_j(v)$ with $v$ in $\mathcal{V}_K - F$, and let
$\varphi_j(P) = \gamma_1(v')$ with $v'$ in $\mathcal{V}_K$. Then $R_v = \mathcal{O}_{\gamma_j(v)}(C_j) = \mathcal{O}_{\varphi_j(P)}(\overline{C_1}) \subseteq R_{v'}$, and
it follows that $v' = v$. In particular, $v'$ is in $\mathcal{V}_K - F$, and $\gamma_1(v) = \varphi_j(\gamma_j(v))$.
Hence

    $\varphi_j \circ \gamma_j : \mathcal{V}_K - F \to U_1$ is independent of $j$,

and $\varphi_j : U_j \to U_1$ is an isomorphism.


The product $W = \overline{C_1} \times \cdots \times \overline{C_m}$ is an $m$-dimensional closed subvariety of
$\mathbb{P}^{k_1} \times \cdots \times \mathbb{P}^{k_m}$, which in turn is a projective variety in $\mathbb{P}^N$ for a suitably large $N$.
For $1 \le j \le m$, let $\pi_j : W \to \overline{C_j}$ be the $j$th projection map; this is a morphism.
The set $U_1 \times \cdots \times U_m$ is an open subvariety of $W$, and the “diagonal”

    $\Delta = \{\delta(P) = (P, \varphi_2^{-1}(P), \dots, \varphi_m^{-1}(P)) \mid P \in U_1\}$

of $U_1 \times \cdots \times U_m$ is an irreducible curve isomorphic to $U_1$. The closure $C = \overline{\Delta}$
is an irreducible projective curve. It is a closed subvariety of $W$, and it has $\Delta$ as
an open subvariety. The curve $\Delta$ may be identified with $U_1$ via the projection $\pi_1$,
and we may therefore identify the function field of $\Delta$, which is the same as the
function field of $C$, with $K$.

We shall show that $C$ is nonsingular. For each $j$, the restriction $\pi_j : C \to \overline{C_j}$ is
a morphism, and the image contains all points $\pi_j(\delta(P)) = \varphi_j^{-1}(P)$ with $P \in U_1$.
Hence it contains $U_j$, which is an open subset of $\overline{C_j}$. In other words, $\pi_j : C \to \overline{C_j}$
is a dominant morphism. For $P \in U_1$, we have $\pi_j(\delta(P)) = \varphi_j^{-1}(P)$. If
$Q = \delta(P)$, this says that $\pi_j(Q) = \varphi_j^{-1}\delta^{-1}(Q)$, from which it follows that
$\delta \circ \varphi_j$ is a two-sided inverse of $\pi_j$ on $\Delta$. Consequently the dominant morphism
$\pi_j : C \to \overline{C_j}$ is a birational map. Let $(V_j, \psi_j)$ be a pair in the class of the rational
map $\pi_j^{-1}$; we may assume that $V_j$ is the largest domain in $\overline{C_j}$ on which $\psi_j$ is a
morphism.

Let $P$ be any point of $C$, and let $M_P$ be the maximal ideal of $\mathcal{O}_P(C)$. Corollary
10.58 shows that there is a member $v$ of $\mathcal{V}_K$ such that $M_P \subseteq \mathfrak{p}_v$. Choose $j = j(P)$
with $1 \le j \le m$ such that $\gamma_j(v)$ is in $C_j$. Since every point of $C_j$ is a nonsingular
point by construction, Corollary 10.54 shows that every point of $C_j$ lies in the
domain $V_j$ on which $\psi_j$ is defined as a morphism inverting $\pi_j$. Consequently the
open subvariety $\pi_j^{-1}(C_j)$ of $C$ is isomorphic to the nonsingular irreducible affine
curve $C_j$, and the point $P$ of $C$ has an open neighborhood of nonsingular points.
Since $P$ is arbitrary, $C$ is nonsingular.


The remainder of this section develops a small theory of products of varieties 
in projective spaces. Most of the proofs are left to the problems at the end of the 
chapter. It is enough to handle the product of two varieties because general finite 
products of varieties can then be treated by induction. 

We begin with the product of two projective spaces. Let $m \ge 1$ and $n \ge 1$ be
integers, and put $N = (m+1)(n+1) - 1 = mn + m + n$. We shall exhibit
$\mathbb{P}^m \times \mathbb{P}^n$ as a projective variety in $\mathbb{P}^N$. To do so, we coordinatize $\mathbb{P}^m$, $\mathbb{P}^n$, and $\mathbb{P}^N$
by using $x_i$, $y_j$, and $w_{ij}$ for $0 \le i \le m$ and $0 \le j \le n$. Then

    $\mathbb{P}^m = \{[x_0,\dots,x_m]\}, \qquad \mathbb{P}^n = \{[y_0,\dots,y_n]\},$

and

    $\mathbb{P}^N = \{[w_{00}, w_{01}, \dots, w_{m,n-1}, w_{mn}]\}.$

The Segre embedding is the function

    $\sigma([x_0,\dots,x_m],[y_0,\dots,y_n]) = [x_0y_0, x_0y_1, \dots, x_my_{n-1}, x_my_n],$

i.e., $w_{ij} = x_iy_j$. Define $\mathfrak{a} \subseteq k[W_{00},\dots,W_{mn}]$ to be the homogeneous ideal
generated by all $W_{ij}W_{kl} - W_{il}W_{kj}$. Problems 17-19 at the end of the chapter
show that $\sigma$ is well defined and one-one, that the image of $\sigma$ is $V(\mathfrak{a})$, and that
$V(\mathfrak{a})$ is irreducible. Thus the Segre embedding exhibits $\mathbb{P}^m \times \mathbb{P}^n$ as a projective
variety in $\mathbb{P}^N$. This variety is known as a Segre variety.¹⁸
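
The definition of the Segre embedding is easy to experiment with numerically. The following sketch is illustrative only: it assumes integer homogeneous coordinates, arranges the coordinates $w_{ij} = x_iy_j$ as an $(m+1)$-by-$(n+1)$ matrix, and checks the generators $W_{ij}W_{kl} - W_{il}W_{kj}$ of the ideal on one sample point. The function names are hypothetical.

```python
from itertools import product


def segre(x, y):
    """Image of ([x0..xm], [y0..yn]) as the matrix (w_ij) with w_ij = x_i * y_j."""
    return [[xi * yj for yj in y] for xi in x]


def minors_vanish(w):
    """Check W_ij*W_kl - W_il*W_kj = 0 for every choice of indices."""
    rows, cols = range(len(w)), range(len(w[0]))
    return all(
        w[i][j] * w[k][l] == w[i][l] * w[k][j]
        for i, k in product(rows, rows)
        for j, l in product(cols, cols)
    )


m, n = 2, 1
N = (m + 1) * (n + 1) - 1          # number of coordinates of P^N, minus one
w = segre([1, 2, 3], [4, 5])       # a sample point of P^2 x P^1
print(N, w, minors_vanish(w))
```

Rescaling either argument rescales every $w_{ij}$ by the same factor, which is the reason the map is well defined on projective points; a matrix such as the 2-by-2 identity fails the test and so does not lie in the image.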

Let $U \subseteq \mathbb{P}^m$ and $V \subseteq \mathbb{P}^n$ be projective algebraic sets. Then the Segre
embedding $\sigma$ carries $U \times V$ to a subset of $\mathbb{P}^N$, and we wish to see that $\sigma(U \times V)$ is
a projective algebraic set in $\mathbb{P}^N$. Let us use the abbreviation $X = (X_0,\dots,X_m)$.
If $\alpha = (\alpha_0,\dots,\alpha_m)$ is an $(m+1)$-tuple of nonnegative integers, we define
$|\alpha| = \alpha_0 + \cdots + \alpha_m$ and $X^\alpha = X_0^{\alpha_0} \cdots X_m^{\alpha_m}$. We define $Y$, $\beta$, $|\beta|$, and $Y^\beta$ similarly.
Any monomial $X^\alpha Y^\beta$ with $|\alpha| = d$ and $|\beta| = e$ is said to be bihomogeneous of
bidegree $(d, e)$. A bihomogeneous polynomial of bidegree $(d, e)$ is any linear
combination of bihomogeneous monomials of bidegree $(d, e)$.

The first observation is that any projective algebraic set $S$ in $\mathbb{P}^m$ can be described
as the locus of common zeros of a vector space of homogeneous polynomials in
$X$ of a fixed degree. In fact, we know that $S$ is given by the locus of common
zeros of a finite set of homogeneous polynomials $F_1(X),\dots,F_r(X)$ of various
degrees $d_1,\dots,d_r$. Let us say that $d = \max_j d_j$. The point is that $S$ is given
by the locus of common zeros of a finite set of homogeneous polynomials all of
degree $d$. The reason is that the locus of common zeros of $F_j(X)$ is the same
as the locus of common zeros of $X_0^{d-d_j}F_j(X),\dots,X_m^{d-d_j}F_j(X)$. The assertion
about describing $S$ follows.
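
The degree-raising step works over any field, because at each projective point some coordinate is nonzero. The sketch below checks it by brute force in a setting chosen purely for convenience and not taken from the text: the plane over the finite field with 5 elements, the linear form $F = X_0 - X_1$, and the exponent $d - d_j = 2$.

```python
from itertools import product

P_MOD = 5  # hypothetical small prime; coordinates are taken modulo 5


def nonzero_triples(p):
    """Nonzero triples over F_p (several triples represent one projective point)."""
    return [pt for pt in product(range(p), repeat=3) if any(pt)]


def F(x):
    """A homogeneous polynomial of degree 1: F = X0 - X1."""
    return (x[0] - x[1]) % P_MOD


def raised(x, k):
    """The family X0^k * F, X1^k * F, X2^k * F, all homogeneous of degree 1 + k."""
    return [(xi ** k * F(x)) % P_MOD for xi in x]


locus_F = {pt for pt in nonzero_triples(P_MOD) if F(pt) == 0}
locus_raised = {
    pt for pt in nonzero_triples(P_MOD) if all(v == 0 for v in raised(pt, 2))
}
print(len(locus_F), locus_F == locus_raised)
```

The two loci agree: if every $X_i^2F$ vanishes at a triple with some nonzero coordinate, then $F$ itself vanishes there.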

Now let $U \subseteq \mathbb{P}^m$ be the locus of common zeros of homogeneous polynomials
$F_1(X),\dots,F_r(X)$ all of degree $d$, and let $V \subseteq \mathbb{P}^n$ be the locus of common zeros
of homogeneous polynomials $G_1(Y),\dots,G_s(Y)$ all of degree $e$. Then $U \times V$
is the locus of common zeros of the bihomogeneous polynomials $F_i(X)G_j(Y)$,
all of bidegree $(d, e)$. These cannot immediately be expressed in terms of the
polynomials $W_{ij}$ of the Segre embedding. However, if we use the same trick
again, we can substitute the $W_{ij}$'s. Specifically suppose that $d \le e$. Replace
$F_1(X),\dots,F_r(X)$ by a family of $r(m+1)$ polynomials $F'_1(X),\dots,F'_{r(m+1)}(X)$
homogeneous of degree $e$. Then the polynomials $F'_i(X)G_j(Y)$ are bihomogeneous
of bidegree $(e, e)$. When such a polynomial is expanded as a linear
combination of monomials, each monomial has $e$ factors from among $X_0,\dots,X_m$
and $e$ factors from among $Y_0,\dots,Y_n$. We can pair the factors in whatever fashion
we want and replace $X_iY_j$ by $W_{ij}$. In this way our system of bihomogeneous
polynomials can be rewritten as a system of polynomials $H_l(W)$, together with
the convention that $W_{ij} = X_iY_j$. Then $\sigma(U \times V)$ is the locus of common zeros in
$\mathbb{P}^N$ of the polynomials $H_l(W)$ and the defining polynomials of the Segre variety.


¹⁸ If we form the $(m+1)$-by-$(n+1)$ matrix whose $(i, j)$th entry is $W_{ij}$, then an equivalent
description of the Segre variety is as the locus of common zeros of all 2-by-2 minors of this matrix.

Conversely if we have a projective algebraic set in $\mathbb{P}^N$, then its intersection
with the Segre variety can be described as the locus of common zeros in $\mathbb{P}^m \times \mathbb{P}^n$
of a family of bihomogeneous polynomials in $(X, Y)$. We have only to take the
defining homogeneous polynomials $H(W)$ and substitute the definition $W_{ij} =
X_iY_j$ for $W_{ij}$. If $H(W)$ is homogeneous of degree $e$, then the result of the
substitution is a polynomial bihomogeneous of bidegree $(e, e)$.
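
The substitution $W_{ij} = X_iY_j$ can be carried out mechanically. In the sketch below, which is an illustration with a hypothetical data layout, a monomial in the $W_{ij}$ is a tuple of index pairs (one pair per factor), and the result of the substitution is a dictionary keyed by the exponent vectors $(\alpha, \beta)$ of $X^\alpha Y^\beta$.

```python
def substitute_segre(h, m, n):
    """Replace each W_ij by X_i*Y_j in h = {monomial: coeff}, where a monomial
    is a tuple of index pairs (i, j), one pair for each factor W_ij."""
    out = {}
    for monomial, coeff in h.items():
        alpha = [0] * (m + 1)  # exponents of X_0..X_m
        beta = [0] * (n + 1)   # exponents of Y_0..Y_n
        for i, j in monomial:
            alpha[i] += 1
            beta[j] += 1
        key = (tuple(alpha), tuple(beta))
        out[key] = out.get(key, 0) + coeff
    return {key: c for key, c in out.items() if c != 0}


def bidegrees(poly):
    """Set of bidegrees (|alpha|, |beta|) occurring in the polynomial."""
    return {(sum(alpha), sum(beta)) for alpha, beta in poly}


# H = W00^2 + W01*W11, homogeneous of degree 2 in the W's (m = n = 1).
H = {((0, 0), (0, 0)): 1, ((0, 1), (1, 1)): 1}
# A defining polynomial of the Segre variety: W00*W11 - W01*W10.
minor = {((0, 0), (1, 1)): 1, ((0, 1), (1, 0)): -1}
print(bidegrees(substitute_segre(H, 1, 1)), substitute_segre(minor, 1, 1))
```

A polynomial of degree 2 in the $W_{ij}$ becomes bihomogeneous of bidegree $(2, 2)$, and the 2-by-2 minor becomes the zero polynomial, which reflects the fact that the defining polynomials of the Segre variety vanish identically on the image of $\sigma$.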

Problems 20-21 at the end of the chapter show that if $U$ and $V$ are irreducible
closed sets in $\mathbb{P}^m$ and $\mathbb{P}^n$, respectively, then $\sigma(U \times V)$ is irreducible in $\mathbb{P}^N$. Thus
we can meaningfully speak of projective varieties in $\mathbb{P}^m \times \mathbb{P}^n$. The same pair of
problems addresses what happens for quasiprojective varieties, showing that $\sigma$ of
any relatively open subset of a projective variety in $\mathbb{P}^m \times \mathbb{P}^n$ is a quasiprojective
variety in $\mathbb{P}^N$.

Now that the notion of variety is meaningful in $\mathbb{P}^m \times \mathbb{P}^n$, with an interpretation
in $\mathbb{P}^N$, we can similarly translate definitions and facts about morphisms to make
them apply in $\mathbb{P}^m \times \mathbb{P}^n$. In particular, the projection of a variety to either factor
$\mathbb{P}^m$ or $\mathbb{P}^n$ is a morphism on the variety. If $U$ is a quasiprojective variety and if
$\varphi_1 : U \to \mathbb{P}^m$ and $\varphi_2 : U \to \mathbb{P}^n$ are isomorphisms of $U$ onto quasiprojective
varieties in $\mathbb{P}^m$ and $\mathbb{P}^n$, then the diagonal $\Delta = \{(\varphi_1(u), \varphi_2(u)) \mid u \in U\}$ is a
quasiprojective variety in $\mathbb{P}^m \times \mathbb{P}^n$, and the pair $(\varphi_1, \varphi_2)$ is an isomorphism of
varieties. These matters are discussed in Problem 22 at the end of the chapter.


9. Affine Algebraic Sets for Monomial Ideals 


Sections 9-12 in part address aspects of the question of how much one can
make explicit computations with affine and projective varieties. As a general
rule, the tool for such computations is the theory of Gröbner bases, which were
introduced in Sections VIII.7–VIII.10. The topic is an active area of continuing
research.¹⁹ One can think of immediate problems, such as finding the dimension
of an algebraic set, determining the radical of an ideal when the ideal is given,
and deciding whether an ideal is prime. We shall concentrate on just one such
problem, that of finding the dimension.²⁰


¹⁹ The book edited by Buchberger and Winkler contains a number of expository “tutorials” that
give an idea of the breadth of applications of the theory. The book contains also a certain number of
research papers.


Part of the abstract theory in this case dates back to Hilbert, but in combination
with the theory of Gröbner bases it becomes easier to establish and relatively easy
to implement computationally.²¹ We shall prove in Section 12 as a consequence
of this investigation the deep theorem that a system of simultaneous homogeneous
polynomial equations having more variables than equations always has a nonzero
solution.²²

Hilbert associated a polynomial in one variable, now known as the “Hilbert
polynomial,” to each ideal of polynomials over an algebraically closed field.
This polynomial encodes certain algebraic information about the ideal, and some 
features of this polynomial depend only on the geometry of the zero locus. In 
particular, the degree of the polynomial turns out to equal the geometric dimension 
of the zero locus, and that will be what interests us. 

The theory behind Gröbner bases enables one to reduce the theory of the
Hilbert polynomial to the case of a monomial ideal, for which it is relatively easy
to understand.²³ We begin with that case in this section.

Let $k$ be an algebraically closed field, consider affine space $\mathbb{A}^n$, and let $\mathfrak{a}$ be
an ideal in $A = k[X_1, \dots, X_n]$. In this section we shall be interested in the case
that $\mathfrak{a}$ is generated by monomials, in which case it is called a monomial ideal.
The structure of monomial ideals is captured by Lemma 8.17, which says about
such an ideal $\mathfrak{a}$ that

• for any polynomial $f \neq 0$ in $\mathfrak{a}$, each monomial term contributing to $f$
lies in $\mathfrak{a}$,

• $\mathfrak{a}$ has a finite set of monomials as generators,

• if $\{M_1, \dots, M_l\}$ is a set of monomials that generate $\mathfrak{a}$ and if $M$ is any
monomial in $\mathfrak{a}$, then some $M_j$ divides $M$.

Let $e_1, \dots, e_n$ be the standard basis of $\mathbb{A}^n$, and let $\langle e_{j_1}, \dots, e_{j_k} \rangle$ be the
linear span of $e_{j_1}, \dots, e_{j_k}$. The vector space $\langle e_{j_1}, \dots, e_{j_k} \rangle$ is called a coordinate
subspace of $\mathbb{A}^n$. The ideal $\mathfrak{p}_k = (X_1, \dots, X_k)$ in $A$ is prime, and its variety
is $V(\mathfrak{p}_k) = \langle e_{k+1}, \dots, e_n \rangle$. Since $\mathfrak{p}_0 \subsetneq \mathfrak{p}_1 \subsetneq \cdots \subsetneq \mathfrak{p}_n$ is a strictly
increasing sequence of prime ideals in $A$ and since $A$ has Krull dimension $n$,
no strictly increasing sequence of prime ideals containing $\mathfrak{p}_k$ can be longer than
$\mathfrak{p}_k \subsetneq \mathfrak{p}_{k+1} \subsetneq \cdots \subsetneq \mathfrak{p}_n$. It follows that the images of these ideals in $A/\mathfrak{p}_k$
give a strictly increasing sequence of prime ideals of maximal length and that
$A/\mathfrak{p}_k$ has Krull dimension $n - k$. By Theorem 10.7 the geometric dimension of
$V(\mathfrak{p}_k) = \langle e_{k+1}, \dots, e_n \rangle$ is $n - k$. In other words, the geometric dimension of
the vector subspace $\langle e_{k+1}, \dots, e_n \rangle$ is the same as the vector-space dimension.
Relabeling indices in this computation, we see that the geometric dimension of
$\langle e_{j_1}, \dots, e_{j_k} \rangle$ is $k$ if the indices $j_1, \dots, j_k$ are distinct.

²⁰Solutions to the other two problems are known as well. References may be found in
Cox–Little–O'Shea. For determining the radical, see p. 177. For deciding whether an ideal is prime, see
p. 207.

²¹The exposition in Sections 9–12 is based in part on Chapter 9 of the book by Cox–Little–O'Shea
and in part on Chapter I of Hartshorne's book.

²²For one equation with two variables, this amounts to the Fundamental Theorem of Algebra.
For two equations with three variables, it amounts to the existence part of Bezout's Theorem as
formulated in Theorem 8.5.

²³Similarly the computations associated with Gröbner bases make it possible to reduce the
computation of the Hilbert polynomial of a general ideal to the computation of the Hilbert polynomial
of a monomial ideal.

Let us compute the geometric dimension of the zero locus of a general proper
monomial ideal $(M_1, \dots, M_k)$. If $\alpha = (\alpha_1, \dots, \alpha_n)$ is a tuple of integers $\geq 0$,
we write $X^\alpha$ for $X_1^{\alpha_1} \cdots X_n^{\alpha_n}$ and $|\alpha|$ for $\alpha_1 + \cdots + \alpha_n$. Let $H_j = V(X_j)$ be the
coordinate hyperplane of points in $\mathbb{A}^n$ with $j$th coordinate 0. This is the linear
span of all $e_i$ for $i \neq j$, and it has geometric dimension $n - 1$. If a monomial $X^\alpha$
is given, then Proposition 10.1 shows that

$$V(X^\alpha) = \bigcup_{\alpha_j > 0} V(X_j) = \bigcup_{\alpha_j > 0} H_j$$

and then that

$$V(X^\alpha, X^\beta) = \Big( \bigcup_{\alpha_i > 0} H_i \Big) \cap \Big( \bigcup_{\beta_j > 0} H_j \Big) = \bigcup_{\alpha_i > 0,\ \beta_j > 0} (H_i \cap H_j).$$

Similarly $V(M_1, \dots, M_k)$ is a finite union of $k$-fold intersections of coordinate
hyperplanes. By Theorem 10.7 the geometric dimension of $V(M_1, \dots, M_k)$ is the
maximum dimension of the subspaces $H_i \cap H_j \cap \cdots$ appearing in the appropriate
union for $M_1, \dots, M_k$. To get the maximum dimension, we want as few distinct
indices as possible to appear in an intersection $H_i \cap H_j \cap \cdots$. If the smallest possible number
of distinct indices is $m$, then we see that $V(M_1, \dots, M_k)$ has geometric dimension
$n - m$.
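The counting argument above can be tried out directly. The following Python sketch is an illustration only, not part of the text: the function name and the encoding of a monomial as a tuple of exponents are assumptions made here. It looks for the smallest set of indices that meets the support of every generator, which is the number $m$ of the preceding paragraph, and returns $n - m$.

```python
from itertools import combinations

def monomial_zero_locus_dimension(n, generators):
    """Return dim V(M_1, ..., M_k) for monomials given as exponent tuples."""
    # Support of a monomial: the indices j with positive exponent.
    supports = [frozenset(j for j, a in enumerate(g) if a > 0) for g in generators]
    if any(len(s) == 0 for s in supports):
        raise ValueError("a generator equals 1, so the ideal is not proper")
    # Try index sets of increasing size m; the first set that meets every
    # support has the smallest possible number of distinct indices.
    for m in range(n + 1):
        for chosen in combinations(range(n), m):
            chosen = set(chosen)
            if all(s & chosen for s in supports):
                return n - m
    raise AssertionError("unreachable for a proper monomial ideal")

# V(X1*X2, X2*X3^2) in A^3 is the union of {x2 = 0} and {x1 = x3 = 0}.
print(monomial_zero_locus_dimension(3, [(1, 1, 0), (0, 1, 2)]))
```

The search is exponential in $n$, which is acceptable for small examples; it is meant to mirror the argument, not to be efficient.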


The insight is that to study $V(\mathfrak{a})$, one studies $A/\mathfrak{a}$, and that to study the latter,
one considers what happens as a function of $s$ to the part of $A/\mathfrak{a}$ that corresponds
to degree at most $s$. In the case of a monomial ideal, this means that one is to study
the monomials outside the ideal in question, particularly how the number of these
monomials grows with $s$. Let $\mathcal{M}$ be the set of all monomials in $k[X_1, \dots, X_n]$.
For our monomial ideal $\mathfrak{a}$, let $C(\mathfrak{a})$ be the complementary subset to $\mathfrak{a}$ in $\mathcal{M}$ given
by

$$C(\mathfrak{a}) = \{ X^\alpha \in \mathcal{M} \mid X^\alpha \notin \mathfrak{a} \}.$$


Proposition 10.64. If $\mathfrak{a}$ is a proper monomial ideal in $k[X_1, \dots, X_n]$, then
(a) the vector subspace $V(\{X_i \mid i \notin \{j_1, \dots, j_k\}\})$ is contained in $V(\mathfrak{a})$ if
and only if $\{X^\alpha \in \mathcal{M} \mid \alpha \in \langle e_{j_1}, \dots, e_{j_k} \rangle\}$ is contained in $C(\mathfrak{a})$,
(b) the geometric dimension of $V(\mathfrak{a})$ equals the largest vector-space dimension
of a coordinate subspace that lies in $C(\mathfrak{a})$.

REMARK. The hypothesis “proper” is needed for (b), not for (a).


PROOF. For (a), first suppose that $V(\{X_i \mid i \notin \{j_1, \dots, j_k\}\})$ is contained in
$V(\mathfrak{a})$, and suppose that $\alpha$ is in $\langle e_{j_1}, \dots, e_{j_k} \rangle$. Let $P = (x_1, \dots, x_n)$ be the point
with

$$x_i = \begin{cases} 1 & \text{for } i \in \{j_1, \dots, j_k\}, \\ 0 & \text{for } i \notin \{j_1, \dots, j_k\}. \end{cases}$$

Then $P$ is on the zero locus of each $X_i$ for $i \notin \{j_1, \dots, j_k\}$, and hence $P$ is in
$V(\mathfrak{a})$. On the other hand, the value of the monomial $X^\alpha$ at $P$ is 1. Since the value
of every member of $\mathfrak{a}$ at $P$ is 0, $X^\alpha$ cannot be in $\mathfrak{a}$. Thus $X^\alpha$ is in $C(\mathfrak{a})$.

Next suppose that $E = V(\{X_i \mid i \notin \{j_1, \dots, j_k\}\})$ is not contained in $V(\mathfrak{a})$.
Say that $P = (x_1, \dots, x_n)$ is in $E$ but not $V(\mathfrak{a})$. The condition for $P$ to be in $E$
is that $x_i = 0$ for all $i \notin \{j_1, \dots, j_k\}$. Since $P$ is not in $V(\mathfrak{a})$, some member of $\mathfrak{a}$
is nonzero at $P$. The ideal is generated by monomials, and thus some monomial
$X^{\alpha}$ in $\mathfrak{a}$ is nonzero at $P$. Let $\alpha = (\alpha_1, \dots, \alpha_n)$. The (nonzero) value of $X^{\alpha}$ on
$P$ is $x_1^{\alpha_1} \cdots x_n^{\alpha_n}$. Now $x_i = 0$ for all $i \notin \{j_1, \dots, j_k\}$, and consequently no $i$
outside $\{j_1, \dots, j_k\}$ can have $\alpha_i > 0$. Thus $\alpha$ is in $\langle e_{j_1}, \dots, e_{j_k} \rangle$, and $X^{\alpha}$ exhibits
$\{X^\beta \in \mathcal{M} \mid \beta \in \langle e_{j_1}, \dots, e_{j_k} \rangle\}$ as failing to be contained in $C(\mathfrak{a})$.

For (b), we saw before the proof that V (a) is the union of finitely many vector 
subspaces and that each vector subspace is an affine variety whose geometric 
dimension equals its vector-space dimension. By Theorem 10.7 the geometric 
dimension of V (a), a being proper, is the maximum of the dimensions of these 
subspaces. Taking (a) into account, we conclude that (b) holds. 




We seek a formula for the number of monomials in $C(\mathfrak{a})$ of total degree $\leq s$
when $s$ is large and positive. We begin with a lemma. For a monomial ideal
$\mathfrak{a}$, the function carrying each integer $s \geq 0$ to the number of $X^\alpha$ in $C(\mathfrak{a})$ with
$|\alpha| \leq s$ is called the affine Hilbert function of $\mathfrak{a}$ and is denoted by $\mathcal{H}_a(s, \mathfrak{a})$.
For $\mathfrak{a} = k[X_1, \dots, X_n]$, the affine Hilbert function is identically 0, and we shall
usually not be interested in this case.


EXAMPLE. For $n = 1$ with one indeterminate $X$, the proper ideals of $k[X]$ are 0
and $(X^k)$ with $k > 0$. The monomials $X^\alpha$ with $|\alpha| \leq s$ are $1, X, X^2, \dots, X^s$. By
inspection, none of these is in $\mathfrak{a}$ if $\mathfrak{a} = 0$, and thus $\mathcal{H}_a(s, 0) = s + 1$. In the case
of $(X^k)$ with $k > 0$, the monomials $X^\alpha$ in $C((X^k))$ are $1, X, \dots, X^{k-1}$, and thus
$\mathcal{H}_a(s, (X^k))$ is $s + 1$ for $s \leq k - 1$ and is $k$ for $s > k - 1$.
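For small cases the affine Hilbert function of a monomial ideal can be computed by direct enumeration. The Python sketch below is illustrative only: the function name and the representation of monomials as exponent tuples are assumptions introduced here. It counts the exponent tuples $\alpha$ with $|\alpha| \leq s$ for which no generator divides $X^\alpha$, using the third bulleted property of monomial ideals quoted from Lemma 8.17.

```python
from itertools import product

def affine_hilbert_function(n, generators, s):
    """Count monomials X^alpha with |alpha| <= s lying outside the monomial ideal."""
    count = 0
    for alpha in product(range(s + 1), repeat=n):
        if sum(alpha) > s:
            continue
        # X^alpha is in the ideal exactly when some generator divides it.
        in_ideal = any(all(a >= g for a, g in zip(alpha, gen)) for gen in generators)
        if not in_ideal:
            count += 1
    return count

# The one-variable example: the ideal (X^3) gives s + 1 for s <= 2 and 3 afterward.
print([affine_hilbert_function(1, [(3,)], s) for s in range(6)])
```

With an empty generator list the function counts all monomials of degree at most $s$, which corresponds to the zero ideal.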


Theorem 10.65. If $\mathfrak{a}$ is a proper monomial ideal in $k[X_1, \dots, X_n]$, then the
complementary set $C(\mathfrak{a})$ of monomials is a disjoint union
$$C(\mathfrak{a}) = C_0 \cup \cdots \cup C_n,$$
where $C_k$ is a finite union of subsets of the form

$$E = \Big\{ X^\alpha \in \mathcal{M} \ \Big|\ \alpha \in \langle e_{j_1}, \dots, e_{j_k} \rangle + \sum_{i \notin \{j_1, \dots, j_k\}} a_i e_i \Big\}.$$

Here it is assumed that $\langle e_{j_1}, \dots, e_{j_k} \rangle$ is a $k$-dimensional coordinate subspace and
the coefficients $a_i$ are particular integers $\geq 0$.


REMARKS. The subsets of $\mathcal{M}$ of which the above set $E$ is an example will be
called standard subsets of $\mathcal{M}$ with $k$ parameters. The member $X^{\sum_{i \notin \{j_1, \dots, j_k\}} a_i e_i}$
of $\mathcal{M}$ is called the associated translation of $E$, and $\langle e_{j_1}, \dots, e_{j_k} \rangle$ is called the
associated vector subspace of $E$. Standard subsets of $\mathcal{M}$ with 0 parameters are
singleton sets $\{X^\alpha\}$. An example of a standard subset of $\mathcal{M}$ with 1 parameter
when $n = 2$ is $\{X_1^{\alpha_1} X_2^{\alpha_2} \mid \alpha_1 \geq 0,\ \alpha_2 = 2\} = \{X^\alpha \mid \alpha \in \langle e_1 \rangle + 2e_2\}$. It
is apparent that the one and only circumstance in which $C_n$ is nonempty is that
$C(\mathfrak{a}) = \mathcal{M}$, in which case $\mathfrak{a} = 0$.


PROOF. We proceed by induction on $n$, and we may assume that $\mathfrak{a} \neq 0$. The
example above shows for $n = 1$ that $C(\mathfrak{a})$ is a finite set if $\mathfrak{a}$ is a nonzero proper
ideal. Thus $C(\mathfrak{a}) = C_0$ in this case, and the base case of the induction is settled.

Assume inductively that the theorem has been proved for $n - 1$ indeterminates,
and let $\mathfrak{a}$ be a nonzero ideal in $k[X_1, \dots, X_n]$. Let $\mathcal{M}_{n-1}$ and $\mathcal{M}_n$ denote the
sets of monomials in $X_1, \dots, X_{n-1}$ and $X_1, \dots, X_n$, respectively. For $j \geq 0$, let
$\mathfrak{a}_j$ be the ideal in $k[X_1, \dots, X_{n-1}]$ of all polynomials $f(X_1, \dots, X_{n-1})$ such that
$X_n^j f(X_1, \dots, X_{n-1})$ is in $\mathfrak{a}$. The ideals $\mathfrak{a}_j$ are monomial ideals because $\mathfrak{a}$ is a
monomial ideal, and $\mathfrak{a}_j \subseteq \mathfrak{a}_{j+1}$ for all $j$. Since $k[X_1, \dots, X_{n-1}]$ is Noetherian,
there is some index $l$ such that $\mathfrak{a}_j = \mathfrak{a}_l$ for all $j \geq l$. We apply the inductive
hypothesis to $\mathfrak{a}_0, \mathfrak{a}_1, \dots, \mathfrak{a}_l$, writing

$$C(\mathfrak{a}_j) = C_{0,j} \cup \cdots \cup C_{n-1,j} \qquad \text{for } 0 \leq j \leq l.$$

Here each $C_{k,j}$ is a finite union of standard subsets with $k$ parameters in the $n - 1$
indeterminates $X_1, \dots, X_{n-1}$.
Let $C_{k,j} X_n^j$ be the set of all products of members of $C_{k,j}$ with $X_n^j$. We shall
show that
$$C(\mathfrak{a}) = C_0 \cup \cdots \cup C_n, \qquad (*)$$

where $C_0, \dots, C_n$ are defined by

$$C_{k+1} = \bigcup_{j=0}^{\infty} C_{k,l} X_n^j \ \cup\ \bigcup_{j=0}^{l-1} C_{k+1,j} X_n^j \qquad \text{for } 0 \leq k \leq n - 1$$

and
$$C_0 = C(\mathfrak{a}) - \bigcup_{k=1}^{n} C_k.$$

But first let us see that each $C_{k+1}$ for $0 \leq k \leq n - 1$ is a finite union of standard
subsets of $\mathcal{M}_n$ with $k + 1$ parameters. Each $C_{k+1,j}$ is a finite union of standard
subsets of $\mathcal{M}_{n-1}$ with some associated translation $\gamma$ such that $\gamma_n = 0$ and with
an associated vector subspace $\langle e_{j_1}, \dots, e_{j_{k+1}} \rangle$ such that $j_1 < \cdots < j_{k+1} < n$.
Then each $C_{k+1,j} X_n^j$ is a finite union of standard subsets of $\mathcal{M}_n$
with associated translation $\gamma + j e_n$ and with the same associated vector space
$\langle e_{j_1}, \dots, e_{j_{k+1}} \rangle$. Similarly the set $\bigcup_{j \geq 0} C_{k,l} X_n^j$ is a finite union of standard subsets
of $\mathcal{M}_n$ with associated translation $\gamma + 0 e_n$ and with associated vector space of
the form $\langle e_{j_1}, \dots, e_{j_k}, e_n \rangle$. Thus $C_{k+1}$ is a finite union of standard subsets of $\mathcal{M}_n$
with $k + 1$ parameters.

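Let us verify (*). The most general monomial in $k[X_1, \dots, X_n]$ is $X^\beta X_n^j$ with
$X^\beta$ in $k[X_1, \dots, X_{n-1}]$, and this monomial is in $\mathfrak{a}$ if and only if $X^\beta$ is in $\mathfrak{a}_j$.
Hence $X^\beta X_n^j$ is in $C(\mathfrak{a})$ if and only if $X^\beta$ is in $C(\mathfrak{a}_j)$. Since $\mathfrak{a}_j = \mathfrak{a}_l$ for $j \geq l$,
$C(\mathfrak{a}_j) = C(\mathfrak{a}_l)$ for $j \geq l$. Thus

$$C(\mathfrak{a}) = \Big( \bigcup_{j < l} C(\mathfrak{a}_j) X_n^j \Big) \cup \Big( \bigcup_{j \geq l} C(\mathfrak{a}_l) X_n^j \Big). \qquad (**)$$

If $j < l$, then $X^\beta X_n^l \in C(\mathfrak{a})$ implies $X^\beta X_n^j \in C(\mathfrak{a})$, since $X_n \mathfrak{a} \subseteq \mathfrak{a}$. Therefore
$C(\mathfrak{a}_j) \supseteq C(\mathfrak{a}_l)$ for all $j < l$, and we see that $j < l$ implies that $C(\mathfrak{a}_j) =
C(\mathfrak{a}_j) \cup C(\mathfrak{a}_l)$. Substituting into (**) and rearranging terms gives

$$C(\mathfrak{a}) = \Big( \bigcup_{j=0}^{\infty} C(\mathfrak{a}_l) X_n^j \Big) \cup \Big( \bigcup_{j=0}^{l-1} C(\mathfrak{a}_j) X_n^j \Big). \qquad (\dagger)$$

For $j \leq l$, $X^\beta$ is in $C(\mathfrak{a}_j)$ if and only if $X^\beta$ is in one of $C_{0,j}, \dots, C_{n-1,j}$. Thus
we can rewrite ($\dagger$) as

$$\begin{aligned}
C(\mathfrak{a}) &= \Big( \bigcup_{j=0}^{\infty} \bigcup_{k=0}^{n-1} C_{k,l} X_n^j \Big) \cup \Big( \bigcup_{j=0}^{l-1} \bigcup_{k=0}^{n-1} C_{k,j} X_n^j \Big) \\
&= \Big( \bigcup_{j=0}^{\infty} \bigcup_{k=0}^{n-1} C_{k,l} X_n^j \Big) \cup \Big( \bigcup_{j=0}^{l-1} \bigcup_{k=0}^{n-2} C_{k+1,j} X_n^j \Big) \cup \Big( \bigcup_{j=0}^{l-1} C_{0,j} X_n^j \Big).
\end{aligned}$$

The first term on the right side contributes to $C_{k+1}$, with $e_n$ to be adjoined to the
basis vectors of the associated vector subspace $\langle e_{j_1}, \dots, e_{j_k} \rangle$. Equating the terms
on the two sides that contribute to $C_{k+1}$ therefore yields (*). The set $C_0$ is the
last term on the right side. This is finite because each $C_{0,j}$ is finite, and therefore
$C_0$ has the correct form.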


Lemma 10.66. Let $E$ be a standard subset of $\mathcal{M}$ with $k$ parameters, and let
$\gamma$ be its associated translation. Then the number of monomials $X^\alpha$ with $|\alpha| \leq s$
such that $X^\alpha$ is in $E$ is equal to the binomial coefficient

$$\binom{s - |\gamma| + k}{k}$$

if $s \geq |\gamma|$. This expression is a polynomial function of $s$ of degree $k$, and the
coefficient of $s^k$ is $1/k!$.

PROOF. Let $\langle e_{j_1}, \dots, e_{j_k} \rangle$ be the associated vector subspace for $E$. The
associated translation $\gamma$ is assumed to have $\gamma_i = 0$ for $i$ in $\{j_1, \dots, j_k\}$. We are to
count monomials $X^\alpha = X^\gamma X^\beta$ with $\beta$ in $\langle e_{j_1}, \dots, e_{j_k} \rangle$ and with $|\gamma + \beta| \leq s$.
Since $|\gamma| + |\beta| = |\gamma + \beta| \leq s$, the latter condition on $\beta$ is that $|\beta| \leq s - |\gamma|$,
which by assumption is $\geq 0$. The entries of $\beta$ are allowed to be arbitrary nonnegative
integers in the $k$ entries $j_1, \dots, j_k$, subject only to the limitation that the sum of
the entries is to be $\leq s - |\gamma|$. The number of such $\beta$'s equals the number of
homogeneous monomials in $k + 1$ variables of total degree equal to $s - |\gamma|$. This
number is recalled in a bulleted list in Section 3 and is $\binom{s - |\gamma| + k}{k} = \binom{s - |\gamma| + k}{s - |\gamma|}$.
When expanded out, this binomial coefficient equals

$$\frac{1}{k!} (s + k - |\gamma|)(s + k - 1 - |\gamma|) \cdots (s + 1 - |\gamma|),$$

which is a polynomial function of $s$ of degree $k$ with leading coefficient $1/k!$.
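The count in Lemma 10.66 is easy to confirm numerically for small parameters. The following Python sketch is an illustration under assumptions made here (the function name and the choice to pass only $k$, $|\gamma|$, and $s$ are not from the text): it enumerates the exponent tuples $\beta$ with $k$ nonnegative entries whose sum is at most $s - |\gamma|$ and compares the count with the binomial coefficient.

```python
from itertools import product
from math import comb

def count_in_standard_subset(k, gamma_size, s):
    """Brute-force count of beta in N^k with |beta| <= s - |gamma|."""
    t = s - gamma_size
    if t < 0:
        return 0
    return sum(1 for beta in product(range(t + 1), repeat=k) if sum(beta) <= t)

# Compare with the closed form for k = 2, |gamma| = 1, s = 4.
print(count_in_standard_subset(2, 1, 4), comb(4 - 1 + 2, 2))
```

Only the size $|\gamma|$ of the translation matters for the count, which is why the sketch does not need the translation itself.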


Lemma 10.67. Let $E$ and $F$ be standard subsets of $\mathcal{M}$ with $k$ and $l$ parameters,
respectively. Then $E \cap F$ either is empty or is a standard subset of $\mathcal{M}$ with $m$
parameters, where $m \leq \min(k, l)$. Moreover, the only way that $m$ can equal
$\max(k, l)$ is for $E$ to equal $F$.


PROOF. Denote the respective associated translations for $E$ and $F$ by $\gamma_E$ and
$\gamma_F$, and let $S_E$ and $S_F$ be the subsets of $\{1, \dots, n\}$ such that $\langle e_i \mid i \in S_E \rangle$ and
$\langle e_i \mid i \in S_F \rangle$ are the associated vector spaces for $E$ and $F$, respectively. Let $T_E$
be the subset of indices

$$T_E = \{ i \in \{1, \dots, n\} \mid (\gamma_E)_i > 0 \},$$

and define $T_F$ similarly. We are given that $|S_E| = k$ and $|S_F| = l$. Also, we are
given that $S_E \cap T_E = \varnothing$ and $S_F \cap T_F = \varnothing$, i.e., that $T_E \subseteq S_E^c$ and $T_F \subseteq S_F^c$. If
$E \cap F \neq \varnothing$, then there exist $x$ and $y$ with

$$\gamma_E + x = \gamma_F + y \quad \text{such that } x_i = 0 \text{ for } i \notin S_E \text{ and } y_j = 0 \text{ for } j \notin S_F. \qquad (*)$$

Then $x_i = y_i = 0$ for $i \in S_E^c \cap S_F^c$, and we see that a necessary condition to have
$E \cap F \neq \varnothing$ is that $(\gamma_E)_i = (\gamma_F)_i$ for $i \in S_E^c \cap S_F^c$. In this case the $x$ and $y$ in (*)
must have $x_i = (\gamma_F)_i$ for $i \in S_E \cap S_F^c$ and $y_i = (\gamma_E)_i$ for $i \in S_E^c \cap S_F$.

Conversely if $(\gamma_E)_i = (\gamma_F)_i$ for $i \in S_E^c \cap S_F^c$, then we can define $x_i = (\gamma_F)_i$
for $i \in S_E \cap S_F^c$, $y_i = (\gamma_E)_i$ for $i \in S_E^c \cap S_F$, and $x_i = y_i$ to be arbitrary for
$i \in S_E \cap S_F$, and we obtain solutions of (*). It is evident that all solutions of
(*) are obtained this way. Consequently $E \cap F$ is the standard subset of $\mathcal{M}$ with
$|S_E \cap S_F|$ parameters; with associated translation $\gamma$ having $\gamma_i$ equal to $\gamma_E$ on $S_E^c$,
equal to $\gamma_F$ on $S_F^c$, and equal to 0 on $S_E \cap S_F$; and with associated vector space
$\langle e_i \mid i \in S \rangle$, where $S = S_E \cap S_F$.

The inequality $\dim_k(S_E \cap S_F) \leq \min(\dim_k S_E, \dim_k S_F)$ is the inequality
$m \leq \min(k, l)$ of the lemma. If $m = \max(k, l)$, then we must have $S = S_E = S_F$
and an equality $(\gamma_E)_i = (\gamma_F)_i$ for $i \in S_E^c \cap S_F^c$, i.e., for $i \notin S$. The latter equality
implies that $\gamma_E = \gamma_F$. Hence $E = F$.
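The case analysis in the proof of Lemma 10.67 translates into a short procedure. In the Python sketch below, which is illustrative only, a standard subset is encoded as a pair `(S, gamma)` consisting of a set of 0-based free indices and an exponent tuple that vanishes on `S`; this encoding and the function name are assumptions made here, not notation from the text.

```python
def intersect_standard_subsets(E, F):
    """Return the intersection of two standard subsets, or None if it is empty."""
    (S_E, g_E), (S_F, g_F) = E, F
    gamma = []
    for i in range(len(g_E)):
        in_E, in_F = i in S_E, i in S_F
        if not in_E and not in_F:
            # Both exponents are pinned here, so they must agree.
            if g_E[i] != g_F[i]:
                return None
            gamma.append(g_E[i])
        elif in_E and not in_F:
            gamma.append(g_F[i])   # free in E, pinned by F
        elif in_F and not in_E:
            gamma.append(g_E[i])   # free in F, pinned by E
        else:
            gamma.append(0)        # free in both
    return (frozenset(S_E) & frozenset(S_F), tuple(gamma))

# n = 2: {X1^a X2^2} meets {X1 X2^b} in the single monomial X1 X2^2.
print(intersect_standard_subsets(({0}, (0, 2)), ({1}, (1, 0))))
```

The number of parameters of the result is the size of the returned index set, in line with the inequality $m \leq \min(k, l)$.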
Theorem 10.68. If $\mathfrak{a}$ is a monomial ideal in $k[X_1, \dots, X_n]$ such that $V(\mathfrak{a})$ has
geometric dimension $d$, then there exists a polynomial $H_a(s, \mathfrak{a})$ in one variable
of degree $d$ such that the affine Hilbert function $\mathcal{H}_a(s, \mathfrak{a})$ is equal to $H_a(s, \mathfrak{a})$ for
all positive $s$ sufficiently large. The leading coefficient of $H_a(s, \mathfrak{a})$ is positive.


REMARK. The polynomial $H_a(s, \mathfrak{a})$ is called the affine Hilbert polynomial
of the monomial ideal $\mathfrak{a}$. It is of course uniquely determined.


PROOF. For $s$ sufficiently large, we are to count the number of monomials
$X^\alpha$ with $|\alpha| \leq s$ lying in the complementary set $C(\mathfrak{a})$ to $\mathfrak{a}$. Proposition 10.64b
and Theorem 10.65 together show that $C(\mathfrak{a}) = C_0 \cup \cdots \cup C_d$ disjointly, with $C_k$
equal to a finite union of standard subsets of $\mathcal{M}$ with $k$ parameters and with $C_d$
nonempty. The sets $C_k$ being disjoint, it is enough to show that the number of
such monomials in $C_k$ is a function equal for large $s$ to a polynomial of degree $k$,
provided $C_k$ is nonempty.

According to Lemma 10.66, if $E$ is a standard subset of $\mathcal{M}$ with $k$ parameters,
if $s \geq 0$ is sufficiently large, and if $\gamma$ is the translation parameter, then the number
of monomials $X^\alpha$ in $E$ with $|\alpha| \leq s$ is $\binom{s - |\gamma| + k}{k}$ if $s \geq |\gamma|$, which is a polynomial
of degree $k$ with positive leading coefficient.

Because the sets $E$ of this kind whose finite union is $C_k$ may not be disjoint
and because we seek an exact answer for the cardinality $|C_k|$ when $s$ is large, we
cannot simply add finitely many such expressions to obtain a value for $|C_k|$. We
have to take into account the overlaps of the various sets $E$. Thus suppose that
$C_k = E_1 \cup \cdots \cup E_r$ for standard subsets $E_1, \dots, E_r$ of $\mathcal{M}$ with $k$ parameters.
Without loss of generality, we may assume that no two of the sets $E_1, \dots, E_r$ are
equal to one another. Let $E_1(s), \dots, E_r(s)$ be the respective subsets of elements
$X^\alpha$ with $|\alpha| \leq s$. We use the inclusion–exclusion formula, namely

$$\Big| \bigcup_{i=1}^{r} E_i(s) \Big| = \sum_{i} |E_i(s)| - \sum_{i_1 < i_2} |E_{i_1}(s) \cap E_{i_2}(s)| + \sum_{l=3}^{r} (-1)^{l+1} \sum_{i_1 < \cdots < i_l} \Big| \bigcap_{j=1}^{l} E_{i_j}(s) \Big|;$$

this is a formula in Boolean algebra that is readily proved by induction on $r$
starting from the formula $|E \cup F| = |E| + |F| - |E \cap F|$.

Lemma 10.66 shows that $\sum_i |E_i(s)|$ is a sum of functions equal for large $s > 0$
to polynomials of degree $k$ with positive leading coefficient. The leading
coefficients cannot cancel, and thus the sum is for large $s > 0$ equal to a polynomial
of degree $k$ with positive leading coefficient. Each of the remaining terms on
the right side of the inclusion–exclusion formula, according to Lemma 10.67, is
plus or minus the number of monomials $X^\alpha$ with $|\alpha| \leq s$ in some standard subset
$E$ of $\mathcal{M}$ whose number of parameters is $< k$. Hence the sum of all those terms
is a function equal for large $s$ to a polynomial that is 0 or has degree $< k$. Thus
$|C_k(s)|$ is for large $s$ a polynomial of degree $k$ with positive leading coefficient,
and summing over $k \leq d$ gives a polynomial of degree $d$ with positive leading
coefficient. The theorem follows.




Proposition 10.69. A polynomial $P(s)$ in one variable of degree $d$ takes
integer values for $s$ sufficiently large and positive if and only if it is an integer
linear combination of the polynomials $s \mapsto \binom{s}{j}$ for $0 \leq j \leq d$.

PROOF. The sufficiency is immediate because $\binom{s}{j}$ is an integer for each $j$ and
$s$. For necessity, suppose that $P(s)$ is integer-valued and has degree $d$. Since
$s \mapsto \binom{s}{j}$ is integer-valued of degree $j$ with leading coefficient $1/j!$, $P(s)$ is
certainly a rational linear combination of the polynomials $s \mapsto \binom{s}{j}$. We prove
by induction on $d$ that the coefficients are integers. For $\deg P(s) = 0$, we have
$\binom{s}{0} = 1$, and there is nothing to prove. Given an integer-valued $P(s)$ of degree
$d$, write $P(s) = \sum_{j=0}^{d} a_j \binom{s}{j}$. Form

$$\Delta P(s) = P(s + 1) - P(s) = \sum_{j=0}^{d} a_j \Big[ \binom{s+1}{j} - \binom{s}{j} \Big] = \sum_{j=0}^{d-1} a_{j+1} \binom{s}{j},$$

the third equality holding by Pascal's triangle. Since $\Delta P(s)$ is integer-valued
and has degree $d - 1$, the inductive hypothesis shows that $a_{j+1}$ is an integer for
$0 \leq j \leq d - 1$; i.e., $a_j$ is an integer for $1 \leq j \leq d$. Therefore $Q(s) = \sum_{j=1}^{d} a_j \binom{s}{j}$
is integer-valued. Since $P(s) - Q(s) = a_0$ is integer-valued and constant, $a_0$ is
an integer.
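The difference operator $\Delta$ in the proof also gives a practical way to find the coefficients. The Python sketch below is an illustration under assumptions made here: it takes the values $P(0), \dots, P(d)$ as input (the proposition itself only needs integrality for large $s$) and uses the standard forward-difference fact that $a_j = \Delta^j P(0)$. The function names are hypothetical.

```python
from math import comb

def binomial_basis_coefficients(values):
    """Given P(0), ..., P(d), return a_0, ..., a_d with P(s) = sum a_j * C(s, j)."""
    coeffs = []
    row = list(values)
    while row:
        coeffs.append(row[0])                        # a_j is the j-th difference at 0
        row = [b - a for a, b in zip(row, row[1:])]  # apply the difference operator
    return coeffs

def evaluate_in_binomial_basis(coeffs, s):
    """Evaluate sum a_j * C(s, j) at a nonnegative integer s."""
    return sum(a * comb(s, j) for j, a in enumerate(coeffs))

# P(s) = s^2 has values 0, 1, 4 at s = 0, 1, 2.
coeffs = binomial_basis_coefficients([0, 1, 4])
print(coeffs, evaluate_in_binomial_basis(coeffs, 5))
```

Integer input values produce integer coefficients because only subtractions are performed, which matches the conclusion of the proposition.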


Corollary 10.70. If $\mathfrak{a}$ is a monomial ideal in $k[X_1, \dots, X_n]$ such that $V(\mathfrak{a})$
has geometric dimension $d$, then the affine Hilbert polynomial $H_a(s, \mathfrak{a})$ of $\mathfrak{a}$ is of
the form $H_a(s, \mathfrak{a}) = \sum_{j=0}^{d} a_j \binom{s}{d-j}$ with integer coefficients $a_j$ and with $a_0 > 0$.


PROOF. This follows by combining Theorem 10.68 and Proposition 10.69. 


10. Hilbert Polynomial in the Affine Case 


We continue with an algebraically closed field $k$ and with the polynomial ring
$A = k[X_1, \dots, X_n]$. Let $\mathfrak{a}$ be an ideal in $A$. For each integer $s \geq 0$, let $A_{\leq s}$ be
the vector subspace of $A$ consisting of 0 and all elements of degree at most $s$, and
put $\mathfrak{a}_{\leq s} = \mathfrak{a} \cap A_{\leq s}$. The inclusion of $A_{\leq s}$ into $A$ descends to a $k$ linear mapping
$A_{\leq s}/\mathfrak{a}_{\leq s} \to A/\mathfrak{a}$, and this is one-one because $A_{\leq s} \cap \mathfrak{a} \subseteq \mathfrak{a}_{\leq s}$. Thus we can
regard $A_{\leq s}/\mathfrak{a}_{\leq s}$, as $s$ varies, as a sequence of successively better approximations
to $A/\mathfrak{a}$. We define the affine Hilbert function $\mathcal{H}_a(s, \mathfrak{a})$ of $\mathfrak{a}$ by

$$\mathcal{H}_a(s, \mathfrak{a}) = \dim_k A_{\leq s}/\mathfrak{a}_{\leq s} \qquad \text{for } s \geq 0.$$

When $\mathfrak{a}$ is a monomial ideal, this function is the one that was investigated in
the previous section. In fact, the monomials of degree $\leq s$ form a vector-space
basis of $A_{\leq s}$, and the monomials in $\mathfrak{a}$ of degree $\leq s$ form a basis of $\mathfrak{a}_{\leq s}$ because $\mathfrak{a}$
is spanned by monomials. If $C(\mathfrak{a})$ denotes the set of monomials not in $\mathfrak{a}$, then the
monomials of degree $\leq s$ within $C(\mathfrak{a})$ descend to a basis of $A_{\leq s}/\mathfrak{a}_{\leq s}$. The number
of such monomials gives the value of the affine Hilbert function as defined in the
previous section, and thus the new definition is consistent with the old one in the
case of monomial ideals.

When $\mathfrak{a}$ is a proper monomial ideal, we found in Theorem 10.68 that $\mathcal{H}_a(s, \mathfrak{a})$
equals a polynomial function of $s$ for $s$ sufficiently large and that the degree of
this polynomial function equals the geometric dimension of the zero locus $V(\mathfrak{a})$
in the affine space $\mathbb{A}^n$. Our goal in this section is to show that these conclusions
remain valid for all proper ideals $\mathfrak{a}$. The polynomial function that results for such
an $\mathfrak{a}$ will be called the affine Hilbert polynomial of $\mathfrak{a}$.

We shall make the connection between general ideals $\mathfrak{a}$ and monomial ideals
by means of the theory of Sections VIII.7–VIII.10. We recall the notion of a
monomial ordering as defined in Section VIII.7. A monomial ordering $\leq$ is said
to be a graded monomial ordering if $|\beta| < |\alpha|$ implies $X^\beta < X^\alpha$. The graded
lexicographic ordering and the graded reverse lexicographic ordering (Examples 2
and 3 in Section VIII.7) are examples of graded monomial orderings, but the
lexicographic ordering in Example 1 in that section is not a graded monomial
ordering.
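The difference between a graded and an ungraded ordering is visible on exponent tuples. The Python sketch below is illustrative only; it assumes the usual convention $X_1 > X_2 > \cdots$ for the lexicographic comparison, which may differ in detail from the conventions of Section VIII.7, and the function name is hypothetical. Graded lexicographic order compares total degree first and breaks ties lexicographically.

```python
def grlex_key(alpha):
    """Sort key for graded lexicographic order on exponent tuples."""
    return (sum(alpha), tuple(alpha))

monomials = [(0, 2), (1, 0), (2, 0), (0, 1), (1, 1), (0, 0)]

# Increasing graded lexicographic order: lower total degree always comes first.
print(sorted(monomials, key=grlex_key))

# Plain lexicographic order is not graded: X2^2 has degree 2 but precedes X1.
print(sorted(monomials))
```

In the second listing the tuple `(0, 2)` appears before `(1, 0)`, so a monomial of larger total degree precedes one of smaller total degree, which is exactly the failure of the graded condition.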

Fix a graded monomial ordering. As in Section VIII.7, $\mathrm{LT}(f)$ denotes the
leading monomial term of the polynomial $f$. By convention, $\mathrm{LT}(0) = 0$. For our
ideal $\mathfrak{a}$, we let $\mathrm{LT}(\mathfrak{a})$ be the vector space of all linear combinations of polynomials
$\mathrm{LT}(f)$ for $f \in \mathfrak{a}$. This is an ideal in $A$, and it is a monomial ideal. The connection
between the goal of this section and the results of the previous section rests on
the following remarkable theorem.


Theorem 10.71 (Macaulay). Let a graded monomial ordering be imposed
on $k[X_1, \dots, X_n]$. If $\mathfrak{a}$ is any ideal in $k[X_1, \dots, X_n]$, then the affine Hilbert
functions of $\mathfrak{a}$ and $\mathrm{LT}(\mathfrak{a})$ coincide: $\mathcal{H}_a(s, \mathfrak{a}) = \mathcal{H}_a(s, \mathrm{LT}(\mathfrak{a}))$.


PROOF. Fix $s \geq 0$. It is enough to prove that $\mathfrak{a}_{\leq s}$ and $\mathrm{LT}(\mathfrak{a})_{\leq s}$ have the same $k$
dimension. Since there are only finitely many monomials of degree $\leq s$, we can
choose $f_1, \dots, f_k$ in $\mathfrak{a}$ such that their leading monomials $\mathrm{LM}(f_1), \dots, \mathrm{LM}(f_k)$
are distinct and form a vector-space basis of $\mathrm{LT}(\mathfrak{a})_{\leq s}$. Without loss of generality,
we may assume that $\mathrm{LM}(f_1) > \cdots > \mathrm{LM}(f_k)$. Certainly $\dim \mathrm{LT}(\mathfrak{a})_{\leq s} = k$, and
thus it is enough to show that $f_1, \dots, f_k$ lie in $\mathfrak{a}_{\leq s}$ and form a vector-space basis
of $\mathfrak{a}_{\leq s}$.

For each $j$, $\mathrm{LM}(f_j - \mathrm{LT}(f_j)) < \mathrm{LM}(f_j)$. Since the monomial ordering is
graded, this inequality implies that $\deg(f_j - \mathrm{LT}(f_j)) \leq s$. But we know that
$\deg(\mathrm{LT}(f_j)) \leq s$, and therefore $\deg f_j \leq s$. Consequently $f_j$ lies in $\mathfrak{a}_{\leq s}$.

To prove that $\{f_1, \dots, f_k\}$ is linearly independent, suppose that $\sum_{j=1}^{k} c_j f_j = 0$
with all $c_j$ in $k$. Arguing by contradiction, suppose that not all $c_j$ are 0. Let $i$ be the
least index $j$ for which $c_j \neq 0$; then $\mathrm{LM}(f_i) = \mathrm{LM}(c_i f_i) = \mathrm{LM}\big( -\sum_{j > i} c_j f_j \big) \leq
\max_{j > i} \mathrm{LM}(f_j)$, and we arrive at a contradiction. We conclude that $\{f_1, \dots, f_k\}$
is linearly independent.

To prove that $\{f_1, \dots, f_k\}$ spans $\mathfrak{a}_{\leq s}$, we again argue by contradiction. Among
all $g$ in $\mathfrak{a}_{\leq s}$ with $g$ not in the linear span of $\{f_1, \dots, f_k\}$, choose one for which
$\mathrm{LM}(g)$ is the smallest. Certainly $\mathrm{LM}(g)$ is one of $\mathrm{LM}(f_1), \dots, \mathrm{LM}(f_k)$. Say that
$\mathrm{LM}(g) = \mathrm{LM}(f_i)$. For some scalar $c \neq 0$, we must have $\mathrm{LT}(g) = \mathrm{LT}(c f_i)$. Then
$\mathrm{LM}(g - c f_i) < \mathrm{LM}(g)$, and the minimality of $\mathrm{LM}(g)$ forces $g - c f_i$ to be in the
linear span of $\{f_1, \dots, f_k\}$. Since $c f_i$ is in the linear span, so is $g$, contradiction.
Thus $\{f_1, \dots, f_k\}$ is a spanning set of $\mathfrak{a}_{\leq s}$.


Corollary 10.72. If $\mathfrak{a}$ is an ideal in $k[X_1, \dots, X_n]$, then for all $s$ sufficiently
large, the affine Hilbert function $\mathcal{H}_a(s, \mathfrak{a})$ of $\mathfrak{a}$ equals a polynomial in $s$ of the
form $\sum_{j=0}^{d} a_j \binom{s}{d-j}$ with integer coefficients $a_j$ and with $a_0 > 0$.


REMARKS. The polynomial in the statement of the corollary is called the affine
Hilbert polynomial of $\mathfrak{a}$ and is denoted by $H_a(s, \mathfrak{a})$. It is the 0 polynomial if and
only if $\mathfrak{a} = k[X_1, \dots, X_n]$.


PROOF. Theorem 10.71 says that $\mathcal{H}_a(s, \mathfrak{a}) = \mathcal{H}_a(s, \mathrm{LT}(\mathfrak{a}))$. Consequently the
result follows immediately by applying Corollary 10.70 to $\mathrm{LT}(\mathfrak{a})$.


Corollary 10.73. If a graded monomial ordering is imposed on $k[X_1, \dots, X_n]$
and if $\mathfrak{a}$ is any ideal in $k[X_1, \dots, X_n]$, then the affine Hilbert polynomials of $\mathfrak{a}$
and $\mathrm{LT}(\mathfrak{a})$ coincide: $H_a(s, \mathfrak{a}) = H_a(s, \mathrm{LT}(\mathfrak{a}))$.


PROOF. This is immediate from Theorem 10.71 and the definition of the affine
Hilbert polynomial given in the remarks with Corollary 10.72.


Corollary 10.74. If $\mathfrak{a}$ and $\mathfrak{b}$ are proper ideals of $k[X_1, \dots, X_n]$ such that
$\mathfrak{a} \subseteq \mathfrak{b}$, then $\deg H_a(s, \mathfrak{a}) \geq \deg H_a(s, \mathfrak{b})$.


PROOF. Introduce a graded monomial ordering. The inclusion $\mathfrak{a} \subseteq \mathfrak{b}$ implies
that $\mathrm{LT}(\mathfrak{a}) \subseteq \mathrm{LT}(\mathfrak{b})$. Therefore $C(\mathrm{LT}(\mathfrak{a})) \supseteq C(\mathrm{LT}(\mathfrak{b}))$. Proposition 10.64b shows
that the geometric dimension of $V(\mathrm{LT}(\mathfrak{a}))$ is the largest vector-space dimension
of a coordinate subspace that lies in $C(\mathrm{LT}(\mathfrak{a}))$, and the same thing is true for
$\mathrm{LT}(\mathfrak{b})$. Thus the geometric dimension of $V(\mathrm{LT}(\mathfrak{a}))$ is $\geq$ the geometric dimension
of $V(\mathrm{LT}(\mathfrak{b}))$. By Theorem 10.68, $\deg H_a(s, \mathrm{LT}(\mathfrak{a})) \geq \deg H_a(s, \mathrm{LT}(\mathfrak{b}))$. The
result now follows immediately from Corollary 10.73.


The affine Hilbert polynomial $H_a(s, \mathfrak{a})$ of $\mathfrak{a}$ depends on $\mathfrak{a}$, not just $V(\mathfrak{a})$, but
we shall be interested mainly in the degree of $H_a(s, \mathfrak{a})$. Proposition 10.76, as
amplified in Corollary 10.77, implies that the degree depends only on $V(\mathfrak{a})$. It
requires a lemma.

Lemma 10.75. If $\mathfrak{a}$ is a monomial ideal in $k[X_1, \dots, X_n]$, then so is $\sqrt{\mathfrak{a}}$.


PROOF. The preliminary remarks in Section 9 show that $V(\mathfrak{a})$ is a finite union
of coordinate subspaces. Let us write $V(\mathfrak{a}) = \bigcup_j E_j$ accordingly. By Proposition
10.2b, $\sqrt{\mathfrak{a}} = I(V(\mathfrak{a})) = I\big(\bigcup_j E_j\big) = \bigcap_j I(E_j)$. Since $E_j$ is an affine variety
and is equal to $V(X_{i_1}, \dots, X_{i_r})$ for suitable $X_{i_1}, \dots, X_{i_r}$, the Nullstellensatz
shows that $I(E_j)$ is an ideal of the form $I(E_j) = (X_{i_1}, \dots, X_{i_r})$. This is a
monomial ideal, and it is therefore enough to show that the finite intersection of
monomial ideals is a monomial ideal. By induction it is enough to show that
$\mathfrak{b} \cap \mathfrak{c}$ is a monomial ideal if $\mathfrak{b}$ and $\mathfrak{c}$ are monomial ideals. If an element of $\mathfrak{b} \cap \mathfrak{c}$ is
given, then that element is a linear combination of the monomials in $\mathfrak{b}$ and is also
a linear combination of the monomials in $\mathfrak{c}$. Since $\mathcal{M}$ is linearly independent, the
element is a linear combination of monomials lying in $\mathfrak{b} \cap \mathfrak{c}$. Therefore $\mathfrak{b} \cap \mathfrak{c}$ is a
monomial ideal.


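Proposition 10.76. If $\mathfrak{a}$ is a proper ideal in $k[X_1, \dots, X_n]$, then the degrees
of the affine Hilbert polynomials $H_a(s, \mathfrak{a})$ and $H_a(s, \sqrt{\mathfrak{a}})$ are equal.


PROOF. Fix a graded monomial ordering. We begin by proving that

$$\mathrm{LT}(\mathfrak{a}) \subseteq \mathrm{LT}(\sqrt{\mathfrak{a}}) \subseteq \sqrt{\mathrm{LT}(\mathfrak{a})}. \qquad (*)$$

The left-hand inclusion is immediate because $\mathfrak{a} \subseteq \sqrt{\mathfrak{a}}$. For the right-hand
inclusion, let $f \neq 0$ be in $\sqrt{\mathfrak{a}}$, and let $X^\alpha = \mathrm{LM}(f)$ be the leading monomial of
$f$. Since $f$ is in $\sqrt{\mathfrak{a}}$, $f^r$ is in $\mathfrak{a}$ for some $r > 0$. Since the leading monomial of a
product is the product of the leading monomials, $\mathrm{LM}(f^r) = X^{r\alpha}$. Thus a power
of $X^\alpha$ is exhibited as in $\mathrm{LT}(\mathfrak{a})$, and $X^\alpha$ is in $\sqrt{\mathrm{LT}(\mathfrak{a})}$. This proves (*).

Applying Corollary 10.74 to (*), we obtain

$$\deg H_a(s, \mathrm{LT}(\mathfrak{a})) \geq \deg H_a(s, \mathrm{LT}(\sqrt{\mathfrak{a}})) \geq \deg H_a(s, \sqrt{\mathrm{LT}(\mathfrak{a})}). \qquad (**)$$

The ideal $\mathrm{LT}(\mathfrak{a})$ is a monomial ideal, and Lemma 10.75 shows that $\sqrt{\mathrm{LT}(\mathfrak{a})}$ is a
monomial ideal. Then $\mathrm{LT}(\mathfrak{a})$ and $\sqrt{\mathrm{LT}(\mathfrak{a})}$ are monomial ideals with $V(\mathrm{LT}(\mathfrak{a})) =
V(\sqrt{\mathrm{LT}(\mathfrak{a})})$, and Theorem 10.68 shows that

$$\deg H_a(s, \mathrm{LT}(\mathfrak{a})) = \deg H_a(s, \sqrt{\mathrm{LT}(\mathfrak{a})}).$$

Comparing this conclusion with (**), we see that

$$\deg H_a(s, \mathrm{LT}(\mathfrak{a})) = \deg H_a(s, \mathrm{LT}(\sqrt{\mathfrak{a}})). \qquad (\dagger)$$

In combination with the equalities $H_a(s, \mathfrak{a}) = H_a(s, \mathrm{LT}(\mathfrak{a}))$ and $H_a(s, \sqrt{\mathfrak{a}}) =
H_a(s, \mathrm{LT}(\sqrt{\mathfrak{a}}))$ given by Corollary 10.73, ($\dagger$) completes the proof.

Corollary 10.77. If $\mathfrak{a}$ and $\mathfrak{b}$ are proper ideals in $k[X_1, \dots, X_n]$ with $V(\mathfrak{a}) \subseteq
V(\mathfrak{b})$, then $\deg H_a(s, \mathfrak{a}) \leq \deg H_a(s, \mathfrak{b})$.


PROOF. Application of $I(\,\cdot\,)$ to the inclusion $V(\mathfrak{a}) \subseteq V(\mathfrak{b})$ gives $\sqrt{\mathfrak{a}} =
I(V(\mathfrak{a})) \supseteq I(V(\mathfrak{b})) = \sqrt{\mathfrak{b}}$. Then Corollary 10.74 and Proposition 10.76 together
yield $\deg H_a(s, \mathfrak{a}) = \deg H_a(s, \sqrt{\mathfrak{a}}) \leq \deg H_a(s, \sqrt{\mathfrak{b}}) = \deg H_a(s, \mathfrak{b})$.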


Theorem 10.78. If $\mathfrak{a}$ is a prime ideal in $k[X_1, \dots, X_n]$, then the degree of the
affine Hilbert polynomial $H_a(s, \mathfrak{a})$ equals the geometric dimension of the affine
variety $V(\mathfrak{a})$.


PROOF. Define $d = \deg H_a(s, \mathfrak{a})$ and $V = V(\mathfrak{a})$, and let $A(V)$ be the affine
coordinate ring $A(V) = k[X_1, \dots, X_n]/\mathfrak{a}$. Theorem 10.7 shows that $\dim V$
equals the Krull dimension of $A(V)$, and Theorem 7.22 shows that the latter
equals the transcendence degree over $k$ of the field of fractions $k(V)$ of $A(V)$.
Thus the theorem will follow if we show that $k(V)$ has transcendence degree $d$
over $k$.

Let $\varphi : k[X_1, \dots, X_n] \to A(V)$ be the quotient homomorphism, and put
$x_i = \varphi(X_i)$ for $1 \leq i \leq n$. Introduce a graded monomial ordering on $\mathcal{M}$.
Corollary 10.73 shows that $H_a(s, \mathfrak{a}) = H_a(s, \mathrm{LT}(\mathfrak{a}))$, and Theorem 10.68 shows
that $V(\mathrm{LT}(\mathfrak{a}))$ has geometric dimension $d$. We saw in Section 9 that the zero locus
of a monomial ideal is the finite union of coordinate subspaces, and it follows
that $V(\mathrm{LT}(\mathfrak{a})) \subseteq \mathbb{A}^n$ contains a coordinate subspace $E$ of dimension $d$. Let $E$
have as basis the standard vectors $e_{j_1}, \dots, e_{j_d}$, so that

$$E = V(\{X_i \mid i \notin \{j_1, \dots, j_d\}\}).$$

The set $E$ is a variety, and thus $I(E) = (\{X_i \mid i \notin \{j_1, \dots, j_d\}\})$. Also, $E \subseteq
V(\mathrm{LT}(\mathfrak{a}))$, and hence $I(E) \supseteq I(V(\mathrm{LT}(\mathfrak{a}))) \supseteq \mathrm{LT}(\mathfrak{a})$. If $X^\alpha$ is a monomial in $\mathrm{LT}(\mathfrak{a})$,
then it follows that $X^\alpha$ lies in the ideal generated by the $X_i$ for $i \notin \{j_1, \dots, j_d\}$.
We can summarize this fact as follows: if we write $k[X_{j_1}, \dots, X_{j_d}]$ for the subring
of $k[X_1, \dots, X_n]$ of polynomials involving only $X_{j_1}, \dots, X_{j_d}$, then
$$\mathrm{LT}(\mathfrak{a}) \cap k[X_{j_1}, \dots, X_{j_d}] = 0. \qquad (*)$$
If $f$ is any nonzero member of $k[X_{j_1}, \dots, X_{j_d}]$, then its leading monomial $\mathrm{LM}(f)$
has to lie in $k[X_{j_1}, \dots, X_{j_d}]$, and thus (*) implies that
$$\mathfrak{a} \cap k[X_{j_1}, \dots, X_{j_d}] = 0. \qquad (**)$$

Using (**) and notation introduced at the beginning of Section VII.4, we
shall show that $x_{j_1}, \dots, x_{j_d}$ are algebraically independent over $k$, and then it
follows that $d \leq \operatorname{tr.deg} A(V)$. Thus suppose that $g(Y_1, \dots, Y_d)$ is a polynomial
in $k[Y_1, \dots, Y_d]$ such that $g(x_{j_1}, \dots, x_{j_d}) = 0$. We can identify $k[Y_1, \dots, Y_d]$
with $k[X_{j_1}, \dots, X_{j_d}] \subseteq k[X_1, \dots, X_n]$, and then the equality $g(x_{j_1}, \dots, x_{j_d}) = 0$
means that $\varphi(g) = 0$, i.e., $g$ is in $\mathfrak{a}$. Hence $g$ is a member of $\mathfrak{a} \cap k[X_{j_1}, \dots, X_{j_d}]$,
and $g = 0$ by (**). Therefore $x_{j_1}, \dots, x_{j_d}$ are algebraically independent over $k$.


For the reverse inequality, we are to prove that $d \geq \operatorname{tr.deg} A(V)$. Let $r =
\operatorname{tr.deg} A(V)$. The elements $x_i = \varphi(X_i)$ generate $A(V)$ as a $k$ algebra, and
therefore they generate $k(V)$ over $k$ as a field. By Lemma 7.6b some subset
$\{x_{j_1}, \dots, x_{j_r}\}$ of $\{x_1, \dots, x_n\}$ is algebraically independent. Consider the
substitution homomorphism

$$\psi(h) = h(x_{j_1}, \dots, x_{j_r})$$

of $k[Y_1, \dots, Y_r]$ into $A(V)$. This is one-one because the elements $x_{j_1}, \dots, x_{j_r}$ by
assumption are algebraically independent. Fix $s \geq 0$, and consider the restriction
of $\psi$ to $k[Y_1, \dots, Y_r]_{\leq s}$. If $h(Y_1, \dots, Y_r)$ is a monomial $Y^\alpha$ in $k[Y_1, \dots, Y_r]_{\leq s}$
with $\alpha = (\alpha_1, \dots, \alpha_r)$ and $|\alpha| \leq s$, then we see that

$$\psi(Y^\alpha) = \prod_{i=1}^{r} x_{j_i}^{\alpha_i} = \varphi\Big( \prod_{i=1}^{r} X_{j_i}^{\alpha_i} \Big).$$

In other words, $\psi(Y^\alpha)$ is the image under $\varphi$ of a member of $k[X_1, \dots, X_n]$ of
degree $\leq s$. Taking linear combinations of such monomials, we see that $\psi$ is
a one-one $k$ linear mapping

$$\psi : k[Y_1, \dots, Y_r]_{\leq s} \to k[X_1, \dots, X_n]_{\leq s}/\mathfrak{a}_{\leq s} \subseteq A(V).$$

Therefore

$$\mathcal{H}_a(s, \mathfrak{a}) = \dim_k \big( k[X_1, \dots, X_n]_{\leq s}/\mathfrak{a}_{\leq s} \big) \geq \dim_k k[Y_1, \dots, Y_r]_{\leq s} = \binom{r + s}{r}.$$

The binomial coefficient on the right side is a polynomial of degree $r$ in $s$ with
positive leading coefficient. The left side is, for large $s$, a polynomial in $s$ of degree $d$. The
inequality forces $d \geq r$, and the proof is complete.


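Proposition 10.79. If $\mathfrak{a}$ and $\mathfrak{b}$ are proper ideals in $k[X_1, \dots, X_n]$, then
$$\deg H_a(s, \mathfrak{a}\mathfrak{b}) = \max\big( \deg H_a(s, \mathfrak{a}),\ \deg H_a(s, \mathfrak{b}) \big).$$

REMARKS. Proposition 10.1 points out that $V(\mathfrak{a}\mathfrak{b}) = V(\mathfrak{a}) \cup V(\mathfrak{b})$. Since the
degree of the affine Hilbert polynomial of $\mathfrak{a}$ depends only on $V(\mathfrak{a})$, this proposition
says that the degree associated with the union of two affine algebraic sets is the
larger of the degrees associated with each of the sets.


PROOF. Impose a graded monomial ordering on $\mathcal{M}$. Let us check that
$$(\mathrm{LT}(\mathfrak{a}))(\mathrm{LT}(\mathfrak{b})) \subseteq \mathrm{LT}(\mathfrak{a}\mathfrak{b}) \subseteq \mathrm{LT}(\mathfrak{a} \cap \mathfrak{b}) \subseteq \sqrt{(\mathrm{LT}(\mathfrak{a}))(\mathrm{LT}(\mathfrak{b}))}. \qquad (*)$$
In fact, let $f$ be in $\mathfrak{a}$ and $g$ be in $\mathfrak{b}$, and define $X^\alpha = \mathrm{LM}(f)$ and $X^\beta = \mathrm{LM}(g)$
to be the leading monomials of $f$ and $g$. Then $X^{\alpha + \beta} = \mathrm{LM}(fg)$, and hence
the product of any generator of $\mathrm{LT}(\mathfrak{a})$ and any generator of $\mathrm{LT}(\mathfrak{b})$ lies in $\mathrm{LT}(\mathfrak{a}\mathfrak{b})$.
This proves the first inclusion of (*). The second inclusion is immediate because
$\mathfrak{a}\mathfrak{b} \subseteq \mathfrak{a} \cap \mathfrak{b}$. If $X^\alpha = \mathrm{LM}(f)$ with $f \in \mathfrak{a} \cap \mathfrak{b}$, then $(X^\alpha)^2 = \mathrm{LM}(f)\,\mathrm{LM}(f)$ is in
$\mathrm{LT}(\mathfrak{a})\,\mathrm{LT}(\mathfrak{b})$. Hence $X^\alpha$ is in $\sqrt{(\mathrm{LT}(\mathfrak{a}))(\mathrm{LT}(\mathfrak{b}))}$. Thus a generating set of $\mathrm{LT}(\mathfrak{a} \cap \mathfrak{b})$
lies in $\sqrt{(\mathrm{LT}(\mathfrak{a}))(\mathrm{LT}(\mathfrak{b}))}$, and the third inclusion of (*) follows.

In (*), the values of $V(\,\cdot\,)$ on the end two members are the same, according to
Proposition 10.3c, and therefore

$$V(\mathrm{LT}(\mathfrak{a})\,\mathrm{LT}(\mathfrak{b})) = V(\mathrm{LT}(\mathfrak{a}\mathfrak{b})). \qquad (**)$$

The proposition now follows from the computation

$$\begin{aligned}
\max\big( \deg H_a(s, \mathfrak{a}),\ \deg H_a(s, \mathfrak{b}) \big)
&= \max\big( \deg H_a(s, \mathrm{LT}(\mathfrak{a})),\ \deg H_a(s, \mathrm{LT}(\mathfrak{b})) \big) && \text{by Corollary 10.73} \\
&= \max\big( \dim V(\mathrm{LT}(\mathfrak{a})),\ \dim V(\mathrm{LT}(\mathfrak{b})) \big) && \text{by Theorem 10.68} \\
&= \dim\big( V(\mathrm{LT}(\mathfrak{a})) \cup V(\mathrm{LT}(\mathfrak{b})) \big) && \text{by Theorem 10.7} \\
&= \dim V(\mathrm{LT}(\mathfrak{a})\,\mathrm{LT}(\mathfrak{b})) && \text{by Proposition 10.1c} \\
&= \dim V(\mathrm{LT}(\mathfrak{a}\mathfrak{b})) && \text{by (**)} \\
&= \deg H_a(s, \mathrm{LT}(\mathfrak{a}\mathfrak{b})) && \text{by Theorem 10.68} \\
&= \deg H_a(s, \mathfrak{a}\mathfrak{b}) && \text{by Corollary 10.73.}
\end{aligned}$$
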

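Corollary 10.80. If $\mathfrak a$ is any ideal in $k[X_1,\dots,X_n]$, then the geometric
dimension of the affine algebraic set $V(\mathfrak a)$ equals the degree of the affine Hilbert
polynomial $H_a(s,\mathfrak a)$.


PROOF. Write $V(\mathfrak a) = \bigcup_{j=1}^{k} V_j$ as a finite union of affine varieties $V_j$, and
define $\mathfrak p_j = I(V_j)$. Since $V_j$ is irreducible, $\mathfrak p_j$ is prime. Moreover, $V_j = V(I(V_j)) =
V(\mathfrak p_j)$. Then Proposition 10.1c shows that $V(\mathfrak p_1\mathfrak p_2\cdots\mathfrak p_k) = \bigcup_{j=1}^{k} V(\mathfrak p_j) =
\bigcup_{j=1}^{k} V_j = V(\mathfrak a)$. Proposition 10.79 and induction give

$$\deg H_a(s,\mathfrak p_1\mathfrak p_2\cdots\mathfrak p_k) = \max_{1\le j\le k} \deg H_a(s,\mathfrak p_j),$$

and Theorem 10.78 shows that the right side equals $\max_{1\le j\le k} \dim V(\mathfrak p_j) =
\max_{1\le j\le k} \dim V_j$, which equals $\dim V(\mathfrak a)$ by Theorem 10.7.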


As a consequence of Corollary 10.80, we obtain an algorithm for computing 
the dimension of an affine algebraic set V when given an ideal a whose locus 
of common zeros V(a) is V: We introduce any graded monomial ordering and 
compute LT(a), using a Gröbner basis. Corollaries 10.73 and 10.80 together say
that dim V (a) = dim V (LT(a)). The remarks before Proposition 10.64 show how 
to compute dim V (LT(a)), and Proposition 10.64b gives an alternative method of 
computation. 
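
The last step of this algorithm, computing the dimension of the zero locus of a monomial ideal, can be sketched as follows. The sketch assumes that the leading monomials of a Gröbner basis have already been computed by other means and are supplied as exponent tuples; the function name `dim_of_monomial_zero_locus` and the return value $-1$ for an empty locus are conventions of this illustration only. It uses the fact that the zero locus of a monomial ideal is a union of coordinate subspaces, so that the dimension is $n$ minus the least number of variables needed to meet every generator.

```python
from itertools import combinations


def dim_of_monomial_zero_locus(n_vars, generators):
    """Dimension of V(m) in affine n-space for a monomial ideal m.

    `generators` is a list of exponent tuples of length n_vars.  The coordinate
    subspace {x_i = 0 for i in S} lies in V(m) exactly when every generator
    involves some variable whose index is in S.
    """
    supports = [{i for i, e in enumerate(g) if e > 0} for g in generators]
    if any(not supp for supp in supports):
        return -1  # a nonzero constant is a generator, so V(m) is empty
    for size in range(n_vars + 1):
        for subset in combinations(range(n_vars), size):
            chosen = set(subset)
            if all(supp & chosen for supp in supports):
                return n_vars - size
    raise AssertionError("unreachable: the set of all variables meets every support")


# Leading-term ideal (X1*X2, X1*X3) in three variables: the locus is the
# plane X1 = 0 together with the line X2 = X3 = 0.
assert dim_of_monomial_zero_locus(3, [(1, 1, 0), (1, 0, 1)]) == 2
```

The search is exponential in the number of variables, which is harmless for small examples but would need to be replaced by something more careful in serious use.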


11. Hilbert Polynomial in the Projective Case 


In this section we consider the analog for projective space of the theory of
Section 10. We continue with $k$ as an algebraically closed field, and we let
$A = k[X_0,\dots,X_n]$. Our interest is in the zero locus $V(\mathfrak a)$ in $\mathbb P^n$, as defined in
Section 3, of a homogeneous ideal $\mathfrak a$ in $A$. To relate matters to Section 10, we
shall make use of the cone $C(V(\mathfrak a))$ over $V(\mathfrak a)$, which was defined in Section 3
as

$$C(V(\mathfrak a)) = \{(0,\dots,0)\} \cup \{(x_0,\dots,x_n) \in \mathbb A^{n+1} \mid [x_0,\dots,x_n] \in V(\mathfrak a)\}.$$

The homogeneous ideal $\mathfrak a$ is in particular an ideal in $n+1$ variables, and its
associated affine algebraic set is the subset $C(V(\mathfrak a))$ of $\mathbb A^{n+1}$. An affine Hilbert
polynomial $H_a(s,\mathfrak a)$ is therefore associated to $C(V(\mathfrak a))$, and its degree matches
the geometric dimension of $C(V(\mathfrak a))$.

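To get something directly related to the projective algebraic set $V(\mathfrak a)$ in
projective space $\mathbb P^n$, we make a new definition of Hilbert function. Let $A_s =
k[X_0,\dots,X_n]_s$ be the subspace of $A$ of all polynomials homogeneous of degree
$s$. If $\mathfrak a$ is a homogeneous ideal in $A$, let $\mathfrak a_s = \mathfrak a \cap A_s$. The Hilbert function²⁴ of
$\mathfrak a$ is the integer-valued function of $s \ge 0$ defined by

$$\mathcal H(s,\mathfrak a) = \dim_k A_s/\mathfrak a_s \qquad \text{for } s \ge 0.$$

We have $A_{\le s} = A_s \oplus A_{\le s-1}$, and the fact that $\mathfrak a$ is homogeneous implies that
$\mathfrak a_{\le s} = \mathfrak a_s \oplus \mathfrak a_{\le s-1}$. Consequently $A_{\le s}/\mathfrak a_{\le s} \cong A_s/\mathfrak a_s \oplus A_{\le s-1}/\mathfrak a_{\le s-1}$. Therefore

$$\mathcal H(s,\mathfrak a) = \mathcal H_a(s,\mathfrak a) - \mathcal H_a(s-1,\mathfrak a).$$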


This is the fundamental formula by which the algebraic part of the theory of the 
Hilbert function in the projective case can be reduced to the corresponding theory 
in the affine case. 
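
For a monomial ideal both Hilbert functions can be computed by counting the monomials that lie outside the ideal, and the fundamental formula can then be checked directly. The following sketch does this for the ideal generated by $X_0X_1$ in $k[X_0,X_1,X_2]$; the function names are hypothetical, and the restriction to monomial ideals is an assumption of the illustration, made so that the monomials outside the ideal form a basis of each quotient.

```python
from itertools import product


def in_ideal(monomial, generators):
    """True if some generator divides the monomial (exponent-wise comparison)."""
    return any(all(g <= m for g, m in zip(gen, monomial)) for gen in generators)


def hilbert_function(n_vars, generators, s):
    """dim of A_s / a_s: monomials of degree exactly s lying outside the ideal."""
    return sum(1 for m in product(range(s + 1), repeat=n_vars)
               if sum(m) == s and not in_ideal(m, generators))


def affine_hilbert_function(n_vars, generators, s):
    """dim of A_{<=s} / a_{<=s}: monomials of degree at most s outside the ideal."""
    return sum(1 for m in product(range(s + 1), repeat=n_vars)
               if sum(m) <= s and not in_ideal(m, generators))


gens = [(1, 1, 0)]  # the monomial X0*X1 in k[X0, X1, X2]
for s in range(6):
    difference = (affine_hilbert_function(3, gens, s)
                  - affine_hilbert_function(3, gens, s - 1))
    assert hilbert_function(3, gens, s) == difference == 2 * s + 1
```

In this example the Hilbert function is $2s+1$, which is linear in $s$; this is in line with the zero locus of $X_0X_1$ in $\mathbb P^2$ being a union of two lines.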

We know that the affine Hilbert function is a polynomial for large $s$. Since

$$s^j - (s-1)^j = j s^{j-1} - \binom{j}{2} s^{j-2} + \cdots + (-1)^{j-1}$$

is a polynomial of one lower degree and with positive leading coefficient, it
follows that the Hilbert function of $\mathfrak a$ is a polynomial for large $s$, that its degree
is $\dim C(V(\mathfrak a)) - 1$, and that its leading coefficient is positive. This polynomial
is called the Hilbert polynomial of $\mathfrak a$ and is denoted by $H(s,\mathfrak a)$. To connect the
geometric part of the theory of the Hilbert function in the projective case to the
corresponding theory in the affine case, we use the following proposition.


24 It is traditional not to include the word “projective” or any subscript, even though the
terminology is meant to refer to the projective case.
Proposition 10.81. If $\mathfrak a$ is a homogeneous ideal in $k[X_0,\dots,X_n]$ and if the
corresponding projective algebraic set $V(\mathfrak a)$ is nonempty, then

$$\dim C(V(\mathfrak a)) = \dim V(\mathfrak a) + 1.$$


PROOF. The proof of Corollary 10.13 shows that $C(V(\mathfrak a))$ is irreducible in
$\mathbb A^{n+1}$ if and only if $V(\mathfrak a)$ is irreducible in $\mathbb P^n$. Since the dimension in both cases
for a general $\mathfrak a$ is the maximum of the dimensions of irreducible closed subsets,
it is enough to prove the dimensional equality in the irreducible case.

If we have a strictly increasing sequence of irreducible closed subsets $E_0 \subsetneq
E_1 \subsetneq \cdots \subsetneq E_d$ in $\mathbb P^n$, then each $C(E_j)$ is irreducible in $\mathbb A^{n+1}$, and the sequence
$C(E_0) \subsetneq C(E_1) \subsetneq \cdots \subsetneq C(E_d)$ in $\mathbb A^{n+1}$ consists of Zariski closed sets that are
irreducible. Since the subset $\{0\}$ of $\mathbb A^{n+1}$ is irreducible and can be adjoined at the
beginning of the latter sequence, we conclude that $\dim C(V(\mathfrak a)) \ge \dim V(\mathfrak a) + 1$.

We need to prove the reverse inequality in the irreducible case. Since $V(\mathfrak a)$ is
assumed irreducible (and hence nonempty), we may assume that $\mathfrak a$ is prime and omits
at least one of $X_0,\dots,X_n$. To fix the notation, say that $X_0$ is not in $\mathfrak a$. Recall from
Section 3 the substitution homomorphism $\beta_0^t : k[X_0,\dots,X_n] \to k[X_1,\dots,X_n]$
formed by setting $X_0 = 1$. Let $\mathfrak b = \beta_0^t(\mathfrak a)$. This is a prime ideal in $k[X_1,\dots,X_n]$,
according to Theorem 10.20. Let $A(C(V(\mathfrak a))) = k[X_0,\dots,X_n]/\mathfrak a$ and $A(V(\mathfrak b))
= k[X_1,\dots,X_n]/\mathfrak b$. The homomorphism $\beta_0^t$ descends to a homomorphism of
$A(C(V(\mathfrak a)))$ onto $A(V(\mathfrak b))$, which we denote by $\bar\beta_0^t$.

Let $x_0,\dots,x_n$ be the images of $X_0,\dots,X_n$ in $A(C(V(\mathfrak a)))$. The element $x_0$
is transcendental over $k$. In fact, the only alternative is that it is a scalar $c$, since
$k$ is algebraically closed; the equality $x_0 = c$ would imply that $X_0 - c$ is in $\mathfrak a$,
and the fact that $\mathfrak a$ is homogeneous would imply that $X_0$ and $c$ are separately
in $\mathfrak a$, in contradiction to our choice of $X_0$. Consequently $k(x_0)(x_1,\dots,x_n)$
has transcendence degree $r = \dim C(V(\mathfrak a)) - 1$ over $k(x_0)$. Since $x_1,\dots,x_n$
generate $k(x_0)(x_1,\dots,x_n)$ as a field over $k(x_0)$, some subset $\{x_{j_1},\dots,x_{j_r}\}$ of
$\{x_1,\dots,x_n\}$ is a transcendence basis of $k(x_0)(x_1,\dots,x_n)$ as a field over $k(x_0)$.
Thus $\{x_0, x_{j_1},\dots,x_{j_r}\}$ is a transcendence basis of $k(x_0,\dots,x_n)$ over $k$.

The elements $x_0, x_{j_1},\dots,x_{j_r}$ all lie in $A(C(V(\mathfrak a)))$, and we consider their
images $1, \bar\beta_0^t(x_{j_1}),\dots,\bar\beta_0^t(x_{j_r})$ in $A(V(\mathfrak b))$. Suppose that $h(Y_1,\dots,Y_r)$ is a
polynomial in $r$ variables exhibiting the last $r$ of these images as algebraically
dependent. That is, suppose that

$$h\bigl(\bar\beta_0^t(x_{j_1}),\dots,\bar\beta_0^t(x_{j_r})\bigr) = 0. \tag{$*$}$$

Let $h$ have degree $d$. We regard $h$ as a member of $k[X_1,\dots,X_n]_{\le d}$ that depends
only on $X_{j_1},\dots,X_{j_r}$. With this notational change, (*) reads

$$h(X_1,\dots,X_n) \text{ is in } \mathfrak b. \tag{$**$}$$

We now refer to the details of the proof of Theorem 10.20 that are summarized
before Proposition 10.33. The linear mapping $\varphi_d$ with $\varphi_d(f)(X_0,\dots,X_n) =
X_0^d f(X_1/X_0,\dots,X_n/X_0)$ is a two-sided inverse to $\beta_0^t : k[X_0,\dots,X_n]_d \to
k[X_1,\dots,X_n]_{\le d}$. Put $H = \varphi_d(h)$, so that $h = \beta_0^t(H)$. The detail in question is
that

$$\mathfrak a \cap k[X_0,\dots,X_n]_d = \varphi_d\bigl(\mathfrak b \cap k[X_1,\dots,X_n]_{\le d}\bigr). \tag{$\dagger$}$$

By (**), $\varphi_d(h)$ is in the right side of (†). Since (†) is a valid identity, $\varphi_d(h)$ is in the
left side. So $H$ is in $\mathfrak a$. This means that $H(x_0,\dots,x_n) = 0$. Remembering that $H$
depends only on $X_0, X_{j_1},\dots,X_{j_r}$ and that $\{x_0, x_{j_1},\dots,x_{j_r}\}$ is a transcendence set,
we see that $H = 0$. Therefore $h = 0$, and $\{\bar\beta_0^t(x_{j_1}),\dots,\bar\beta_0^t(x_{j_r})\}$ is a transcendence
set in $A(V(\mathfrak b))$. Thus

$$\dim V(\mathfrak b) = \operatorname{tr.deg} A(V(\mathfrak b)) \ge r = \operatorname{tr.deg} A(C(V(\mathfrak a))) - 1 = \dim C(V(\mathfrak a)) - 1.$$

By Corollary 10.19, $\dim V(\mathfrak b) = \dim V(\mathfrak a)$. Hence $\dim C(V(\mathfrak a)) \le \dim V(\mathfrak a) + 1$,
and the proof is complete.


Corollary 10.82. If $\mathfrak a$ is a homogeneous ideal in $k[X_0,\dots,X_n]$ and if the
corresponding projective algebraic set $V(\mathfrak a)$ is nonempty, then $\dim V(\mathfrak a)$ equals
the degree of the Hilbert polynomial $H(s,\mathfrak a)$.


PROOF. This is immediate from Proposition 10.81 because $\dim C(V(\mathfrak a)) =
\deg H_a(s,\mathfrak a)$ and because $\deg H(s,\mathfrak a) = \deg H_a(s,\mathfrak a) - 1$.


We could also obtain a corollary relating H(s, V(a)) and H(s, V (LT(a))) when 
a graded monomial ordering is imposed, and we could then give a geometric way 
of visualizing the dimension in terms of the projective case. But we shall not 
need these details, and we omit them. 


12. Intersections in Projective Space 


Hilbert polynomials are an appropriate tool for dealing with how a projective 
algebraic set intersects a lower-dimensional projective space. In this section we 
consider such intersections, and we obtain as a corollary the deep result that a 
system of homogeneous polynomial equations over an algebraically closed field 
k always has a nonzero solution if there are more variables than equations. 

It will be convenient in this section to adopt the convention that the empty
projective algebraic set has dimension $-1$ and that the 0 Hilbert polynomial has
degree $-1$. To make use of this convention, we recall from the homogeneous
Nullstellensatz (Proposition 10.12a) that a homogeneous ideal $\mathfrak a$ in $k[X_0,\dots,X_n]$
has $V(\mathfrak a)$ empty in $\mathbb P^n$ if and only if there is an integer $N$ such that $\mathfrak a$ contains
$k[X_0,\dots,X_n]_k$ for $k \ge N$. In this case our definition makes $C(V(\mathfrak a))$ consist
of $\{0\}$ alone.²⁵ With the convention that such ideals have $\dim V(\mathfrak a) = -1$ and
$C(V(\mathfrak a)) = \{0\}$, the formula of Proposition 10.81 remains valid, and we can
therefore drop the assumption that $V(\mathfrak a)$ is nonempty. As to Corollary 10.82,
the definition of the Hilbert function when $\mathfrak a$ contains $k[X_0,\dots,X_n]_k$ for all
sufficiently large $k$ makes $\mathcal H(k,\mathfrak a) = 0$ for such $k$; therefore the Hilbert polynomial
in this case is the 0 polynomial, and Corollary 10.82 continues to be valid even
when $V(\mathfrak a)$ is empty.


Theorem 10.83. If $\mathfrak a$ is any homogeneous ideal in $k[X_0,\dots,X_n]$ and if $F$ is
a homogeneous polynomial, then

$$\dim V(\mathfrak a) \ge \dim V(\mathfrak a + (F)) \ge \dim V(\mathfrak a) - 1.$$

In particular, $V(\mathfrak a + (F))$ is nonempty if $\dim V(\mathfrak a) \ge 1$.


PROOF. Since $\mathfrak a \subseteq \mathfrak a + (F)$ and since $V(\,\cdot\,)$ is inclusion reversing, we know
that

$$\dim V(\mathfrak a) \ge \dim V(\mathfrak a + (F)).$$

To obtain the second inequality of the theorem, we shall compare the Hilbert
polynomials $H(s,\mathfrak a)$ and $H(s,\mathfrak a + (F))$, taking advantage of Corollary 10.82.
Let $d = \deg F$, and suppose that $s \ge d$. The identity mapping on $k[X_0,\dots,X_n]_s$
descends to a $k$ linear mapping

$$\varphi : k[X_0,\dots,X_n]_s/\mathfrak a_s \to k[X_0,\dots,X_n]_s/(\mathfrak a + (F))_s,$$

and $\varphi$ is onto, being formed from an onto map. To understand $\ker\varphi$, we shall use
the $k$ linear map

$$\psi : k[X_0,\dots,X_n]_{s-d}/\mathfrak a_{s-d} \to k[X_0,\dots,X_n]_s/\mathfrak a_s$$

induced by multiplication by $F$, which we view as carrying $k[X_0,\dots,X_n]_{s-d}$
into $k[X_0,\dots,X_n]_s/\mathfrak a_s$. Observe that if $G$ is in $k[X_0,\dots,X_n]_{s-d}$, then $FG$ is
in $(\mathfrak a + (F))_s$, and therefore $\varphi \circ \psi = 0$, i.e., $\operatorname{image}\psi \subseteq \ker\varphi$.

We shall prove that equality holds. Thus suppose that $G$ is a member of
$k[X_0,\dots,X_n]_s$ such that $G + \mathfrak a_s$ is in $\ker\varphi$, i.e., that $G$ is in $(\mathfrak a + (F))_s$. Then
we can write $G = G_1 + HF$ with $G_1$ in $\mathfrak a_s$ and $H$ in $k[X_0,\dots,X_n]_{s-d}$. So
$G - G_1 = HF$, and the coset $G + \mathfrak a_s = G - G_1 + \mathfrak a_s$ is $\psi$ of $H + \mathfrak a_{s-d}$. We
conclude that $\operatorname{image}\psi = \ker\varphi$.

Now we compute

$$
\begin{aligned}
\dim_k k[X_0,\dots,X_n]_s/\mathfrak a_s
&= \dim_k(\operatorname{domain}\varphi) = \dim_k(\ker\varphi) + \dim_k(\operatorname{image}\varphi) \\
&= \dim_k(\operatorname{image}\psi) + \dim_k k[X_0,\dots,X_n]_s/(\mathfrak a + (F))_s \\
&\le \dim_k k[X_0,\dots,X_n]_{s-d}/\mathfrak a_{s-d} + \dim_k k[X_0,\dots,X_n]_s/(\mathfrak a + (F))_s.
\end{aligned}
$$


25 Admittedly the inclusion of $\{0\}$ in the cone might seem unnatural if $\mathfrak a = k[X_0,\dots,X_n]$, but
that is the definition that makes this particular $\mathfrak a$ behave like all other ideals.
In terms of Hilbert functions, this says that

$$\mathcal H(s,\mathfrak a) \le \mathcal H(s-d,\mathfrak a) + \mathcal H(s,\mathfrak a + (F)).$$

For large $s$, this is an inequality of polynomials:

$$H(s,\mathfrak a) \le H(s-d,\mathfrak a) + H(s,\mathfrak a + (F)).$$

Since $H(s,\mathfrak a) - H(s-d,\mathfrak a)$ is a polynomial of one lower degree than $H(s,\mathfrak a)$
with leading coefficient positive, we obtain

$$\deg H(s,\mathfrak a) - 1 \le \deg H(s,\mathfrak a + (F)).$$

The second inequality of the theorem now follows from Corollary 10.82. The
final assertion in the theorem takes into account the remarks in the paragraph
preceding the statement of the theorem.


Corollary 10.84. If $\mathfrak a$ is any homogeneous ideal in $k[X_0,\dots,X_n]$ and if
$F_1,\dots,F_r$ are homogeneous polynomials, then

$$\dim V(\mathfrak a) \ge \dim V(\mathfrak a + (F_1,\dots,F_r)) \ge \dim V(\mathfrak a) - r.$$

In particular, $V(\mathfrak a + (F_1,\dots,F_r))$ is nonempty if $\dim V(\mathfrak a) \ge r$.


PROOF. We use Theorem 10.83 inductively, first applying it to the ideal $\mathfrak a$ with
$F = F_1$, then applying it to the ideal $\mathfrak a + (F_1)$ with $F = F_2$, and so on. This
proves the first conclusion, and the second conclusion follows because of the
convention that the empty set has dimension $-1$.


Corollary 10.85. Over an algebraically closed field any system of homoge- 
neous polynomial equations with more variables than equations has a nonzero 
solution. 


PROOF. Let there be $r$ equations and $n+1$ variables with $n+1 > r$, the
equations being $F_1 = 0,\dots,F_r = 0$. The zero locus for each equation is a subset
of $\mathbb P^n$. Applying Corollary 10.84 with $\mathfrak a = 0$ shows that $\dim V(F_1,\dots,F_r) \ge
n - r \ge 0$ and that $V(F_1,\dots,F_r)$ is not empty as long as $n \ge r$.


Corollary 10.85 is the result in the present chapter that was anticipated in 
Problem 23 at the end of Chapter VIII. 
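
The simplest instance of Corollary 10.85 is the case in which every equation is linear, where the statement reduces to the familiar fact from linear algebra and holds over any field. The following sketch finds a nonzero rational solution by row reduction in that special case only; it says nothing about equations of higher degree, and the function name `nonzero_solution` is hypothetical.

```python
from fractions import Fraction


def nonzero_solution(rows, n_vars):
    """Return a nonzero rational solution x of the homogeneous linear system
    given by `rows`, assuming there are fewer equations than variables."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivots = []  # pivots[i] is the pivot column of reduced row i
    r = 0
    for col in range(n_vars):
        pr = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pr is None:
            continue
        m[r], m[pr] = m[pr], m[r]
        pivot = m[r][col]
        m[r] = [v / pivot for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(col)
        r += 1
    # Fewer pivots than variables, so some column is free; set it equal to 1.
    free = next(c for c in range(n_vars) if c not in pivots)
    x = [Fraction(0)] * n_vars
    x[free] = Fraction(1)
    for i, col in enumerate(pivots):
        x[col] = -m[i][free]
    return x


rows = [[1, 2, 3], [4, 5, 6]]  # two equations in three unknowns
x = nonzero_solution(rows, 3)
assert any(v != 0 for v in x)
assert all(sum(a * b for a, b in zip(row, x)) == 0 for row in rows)
```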
13. Schemes


We conclude with some commentary about “schemes.” The subject of algebraic 
geometry studied along the lines of Sections 1-12 suffers from at least two 
shortcomings. One concerns the coefficients that are involved. The original 
impetus for the subject came from systems of polynomial equations in several 
variables. These equations involve addition, subtraction, and multiplication, and 
the requirement that division be allowable is unnatural and cuts down the scope 
of the subject. It immediately cuts out Diophantine equations, for example, to 
say nothing of congruences modulo prime powers. It would be more natural to 
allow the coefficients to lie in any commutative ring with identity. The other 
shortcoming is that the definition of variety depends on an embedding whose 
chief role is to get past the stage of making definitions; soon the embedding is 
stripped away, and the interest is in varieties up to isomorphism. The situation 
is similar to the historical treatment of groups and of manifolds. Groups were 
for the most part originally conceived in terms of group actions, but eventually 
the groups were separated from the actions. Manifolds at first were defined as 
certain subsets of Euclidean space, but eventually they were given an intrinsic 
definition. It would be more in keeping with the wisdom gained from other areas 
of mathematics if varieties could be defined intrinsically right away. 

Schemes, introduced and developed by A. Grothendieck in the late 1950s and 
early 1960s, accomplish both these objectives. The theory of schemes borrows 
ideas and techniques from many areas of mathematics, as will be apparent shortly. 
This section will briefly present some of the definitions, offer some examples, and 
show the sense in which varieties may be regarded as schemes.²⁶ The interested
reader may want to read more, and this section will therefore conclude with some 
bibliographical remarks. 


1. Spectrum. One preliminary remark is necessary. To isolate an affine 
variety from its ambient space $\mathbb A^n$, we can take advantage of Proposition 10.23,
which says that the points of the variety correspond exactly to the maximal ideals
of the affine coordinate ring.²⁷ The set of maximal ideals in a ring, however,
is usually not an object that lends itself to use with mappings. For example the 
canonical inclusion of Z into Q is not reflected in any of the mappings of the 
singleton set {(0)} of maximal ideals of Q into the set of maximal ideals of Z. 
Instead, the theory of schemes works with prime ideals. These behave nicely 
in that the inverse image of a prime ideal under a homomorphism of rings with 
identity is a prime ideal. 


26 The material in this section is based in part on lectures by V. Schechtman given in 1991-92
and in part on the books by Gunning, Hartshorne, and Shafarevich in the Selected References.

27 Readers familiar with some functional analysis will recognize that a similar thing happens with
compact Hausdorff spaces; by a theorem of M. Stone, the points of the space correspond exactly to
the maximal ideals of the algebra of continuous complex-valued functions on the space.
Thus we work with the category of commutative rings with identity, the
motivating example being the affine coordinate ring of an affine variety over an
algebraically closed field. If $A$ is a ring in this category, the spectrum of $A$ is the
set $\operatorname{Spec} A$ of prime ideals of $A$. For example the spectrum of a field consists of
the one element $(0)$, that of a discrete valuation ring consists of $0$ and the unique
maximal ideal, that of a principal ideal domain consists of $0$ and the principal
ideals $(f)$ such that $f$ is an irreducible element, and that of $\mathbb C[X,Y]$ consists of
the ideal $(0)$, the maximal ideals corresponding to one-point sets in $\mathbb C^2$, and all
prime ideals $(f(X,Y))$ of irreducible affine plane curves over $\mathbb C$.

The spectrum of $A$ is understood to carry along with it two additional pieces
of structure. The first piece of structure is an analog for $\operatorname{Spec} A$ of the Zariski
topology.²⁸ To each ideal $\mathfrak a$ of $A$, we associate the subset $V(\mathfrak a) \subseteq \operatorname{Spec} A$ of all
prime ideals $\mathfrak p$ with $\mathfrak a \subseteq \mathfrak p$. The sets $V(\mathfrak a)$ are easily seen to have the defining
properties of the closed sets of a topology, and this topology will always be
understood to be in place. It is immediate from the definition that $V(\mathfrak a) = V(\sqrt{\mathfrak a})$
for every ideal $\mathfrak a$. One checks for any prime ideal $\mathfrak p$ that $V(\mathfrak p) = \overline{\{\mathfrak p\}}$; consequently
the one-point set $\{\mathfrak p\}$ is closed if and only if $\mathfrak p$ is a maximal ideal.
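
For the ring $\mathbb Z$ the closed sets attached to nonzero ideals are easy to list, and two of the defining properties of a topology can be checked by machine in examples. In the sketch below, which is an illustration only, the closed set $V((n))$ for $n \ge 1$ is represented by the set of prime numbers dividing $n$; the function names are hypothetical, and the case $n = 0$, for which $V((0))$ is all of $\operatorname{Spec}\mathbb Z$, is deliberately left out.

```python
from math import gcd


def prime_divisors(n):
    """Set of prime numbers dividing the integer n >= 1, by trial division."""
    primes, p = set(), 2
    while p * p <= n:
        if n % p == 0:
            primes.add(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.add(n)
    return primes


def V(n):
    """The closed set V((n)) in Spec Z for n >= 1, as a set of prime generators."""
    return prime_divisors(n)


a, b = 12, 45
assert V(a * b) == V(a) | V(b)       # V of a product of ideals is the union
assert V(gcd(a, b)) == V(a) & V(b)   # (a) + (b) = (gcd(a, b)) gives the intersection
assert V(1) == set()                 # the unit ideal has empty zero locus
```

The ideal $(0)$ never belongs to $V((n))$ for $n \ne 0$, which is why only nonzero primes appear in these sets.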

At least when $A$ is Noetherian, $\operatorname{Spec} A$ is a Noetherian space, and a notion
of dimension (not necessarily finite) is defined for each closed set in the usual
way²⁹ as in Section 2; for $A$ itself this coincides with the Krull dimension of
$A$. In this situation the irreducible closed sets are the sets $V(\mathfrak p)$ with $\mathfrak p$ prime.
The fact that such a set is irreducible follows from the identity $V(\mathfrak p) = \overline{\{\mathfrak p\}}$; the
converse assertion follows from the identity $V(\mathfrak a) = V(\sqrt{\mathfrak a})$ and the Lasker–Noether
Decomposition Theorem (Problem 14 at the end of Chapter VII). By
Proposition 10.5 every closed set is a finite union of irreducible closed sets, and
thus we have a complete description of the closed sets. For example, in a principal
ideal domain the closed sets consist of the finite sets of nonzero prime ideals, as
well as the set of all prime ideals. For the ring $A = \mathbb C[X,Y]$, every proper closed
set of $\operatorname{Spec} A$ is a finite union of singleton sets $\{(X - x_0, Y - y_0)\}$ and of sets

$$\{(f(X,Y))\} \cup \bigcup_{f(x_0,y_0)=0} \{(X - x_0, Y - y_0)\}$$

with $f(X,Y)$ irreducible.
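If $\varphi : A \to B$ is a homomorphism in our category of rings (always assumed
to carry 1 to 1) and if $\mathfrak p$ is a prime ideal in $B$, then $\varphi^{-1}(\mathfrak p)$ is a prime ideal in $A$.
Thus the definition ${}^a\varphi(\mathfrak p) = \varphi^{-1}(\mathfrak p)$ gives us a function ${}^a\varphi : \operatorname{Spec} B \to \operatorname{Spec} A$.
If $E$ is a subset of $A$, then we readily check that

$$({}^a\varphi)^{-1}(V(E)) = ({}^a\varphi)^{-1}\bigl(\{\mathfrak p \mid \mathfrak p \supseteq E\}\bigr) = \{\mathfrak q \mid {}^a\varphi(\mathfrak q) \supseteq E\} = V(\varphi(E)),$$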


28 A little care is needed with the definitions when $A$ is the 0 ring, which has an identity but no
prime ideals. Then $\operatorname{Spec} A$ is empty, but we will want to allow it as part of the theory. So we need
to allow the empty set as a topological space.

29 The general theory treats dimension as defined even when $A$ is not Noetherian, but it will be
enough in this section to consider only the Noetherian case.
from which it follows that ${}^a\varphi$ is continuous. The function ${}^a\varphi$ can be fairly subtle.
For example, if $\varphi$ is the inclusion of $\mathbb Z$ into the ring $R$ of algebraic integers in
a number field and if $P$ is a nonzero prime ideal in $R$, then ${}^a\varphi(P) = P \cap \mathbb Z$ is
the corresponding prime ideal $(p)$ in $\mathbb Z$; the continuity of ${}^a\varphi$ implies that each
nonzero prime ideal $(p)$ of $\mathbb Z$ arises in this way from only finitely many ideals $P$
in $R$.
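
A concrete instance, offered here only as an illustration and not taken from the text, is the ring $R = \mathbb Z[i]$ of Gaussian integers. Since $\mathbb Z[i] \cong \mathbb Z[x]/(x^2+1)$, the prime ideals $P$ of $R$ with $P \cap \mathbb Z = (p)$ correspond to the distinct irreducible factors of $x^2 + 1$ modulo $p$. The sketch below counts them; the function name is hypothetical, and the argument is assumed to be a prime number.

```python
def primes_of_gaussian_integers_over(p):
    """Number of prime ideals P of Z[i] with P ∩ Z = (p), for a prime number p.

    x^2 + 1 either has roots modulo p, in which case each distinct root gives
    one linear factor and hence one prime ideal, or it is irreducible modulo p,
    in which case there is exactly one prime ideal over p.
    """
    roots = {x for x in range(p) if (x * x + 1) % p == 0}
    return len(roots) if roots else 1


# The fibers of the map Spec Z[i] -> Spec Z over the first few primes.
fiber_sizes = {p: primes_of_gaussian_integers_over(p) for p in (2, 3, 5, 7, 13)}
assert fiber_sizes == {2: 1, 3: 1, 5: 2, 7: 1, 13: 2}
```

Every fiber over a nonzero prime is finite, in agreement with the assertion above, and in this example each fiber has at most two points.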


2. Structure sheaf. The second piece of additional structure carried by the 
spectrum of A is its “structure sheaf,” which is a certain specific sheaf with
base space Spec A. Sheaves were introduced by J. Leray in 1946 in connection 
with partial differential equations and by K. Oka and H. Cartan about 1950 in 
connection with the theory of several complex variables. As with vector bundles, 
sheaves may be viewed as having a base space carrying some topological infor- 
mation and fibers carrying some algebraic information; local sections will be of 
great interest. The initial example of a sheaf in several complex variables is the 
“sheaf of germs of holomorphic functions” on an open set in $\mathbb C^n$, germs being
defined for holomorphic functions on an open set in the same way as they were 
defined in Section 4 for rational functions on a quasi-affine variety. 

We shall define two general notions, “sheaf” and “presheaf,” and compare
them. The prototype of a presheaf in several complex variables is the collection 
of vector spaces of holomorphic functions on each nonempty open subset of the 
given open set; the prototype in classical algebraic geometry is the collection of 
regular functions on each nonempty open subset of a quasiprojective variety. In 
the general case, fix a category to describe the allowable structure on each fiber; 
common choices for the objects in this category are abelian groups, commutative 
rings with identity (called “rings” hereafter in this section), and unital R modules 
for some ring. In defining sheaves and presheaves, we shall write the definitions 
using abelian groups, since it is a simple matter to adjoin the additional structure 
when the fibers are rings or modules. 

Let $X$ be a topological space. A presheaf of abelian groups on the base space
$X$ is a collection $\{\mathcal O(U), \rho_{VU}\}$, parametrized by the open subsets $U$ of $X$ and the
open subsets $V$ of $U$, such that each $\mathcal O(U)$ is an abelian group, $\mathcal O(\varnothing)$ is the 0
group, each $\rho_{VU} : \mathcal O(U) \to \mathcal O(V)$ is a group homomorphism, each $\rho_{UU}$ is the
identity, and $\rho_{WV}\,\rho_{VU} = \rho_{WU}$ whenever $W \subseteq V \subseteq U$. We are to think of $\mathcal O(U)$
as a space of sections of some kind over $U$ and $\rho_{VU}$ as a restriction map carrying
sections over $U$ to sections over $V$. A sheaf of abelian groups on the base space
$X$ is a topological space $\mathcal O$ with a mapping $\pi : \mathcal O \to X$ such that $\pi$ is a local
homeomorphism onto, $\pi^{-1}(P)$ is an abelian group for each $P \in X$, and the group
operations on each $\pi^{-1}(P)$ are continuous in the relative topology from $\mathcal O$. We
are to think of the elements of a sheaf as germs obtained starting from a presheaf.
The individual fibers $\pi^{-1}(P)$ of a sheaf are called stalks. One writes $(X, \mathcal O)$ for
the sheaf, sometimes abbreviating the notation to $\mathcal O$.

It is possible to construct a presheaf from a sheaf, and vice versa. If we are 
given a sheaf $\mathcal O$, we define a section $s$ of $\mathcal O$ over $U$ to be a continuous function
$s : U \to \mathcal O$ such that $\pi \circ s = 1_U$. If $\mathcal O(U)$ denotes the abelian group of sections
of $\mathcal O$ over $U$ and if $\rho_{VU}$ is the restriction map for sections, then $\{\mathcal O(U), \rho_{VU}\}$ is
a presheaf. In the reverse direction if we start from a presheaf $\{\mathcal O(U), \rho_{VU}\}$ and
form the kind of direct limit of abelian groups at each point that is suggested by the
passage to germs, then it is possible to topologize the disjoint union of the abelian
groups of germs so as to produce a sheaf. Passing from a sheaf to a presheaf and
then back to a sheaf reproduces the original sheaf. But passing from a presheaf
to a sheaf and then back to a presheaf does not necessarily reproduce the original
presheaf. A necessary and sufficient condition on the presheaf $\{\mathcal O(U), \rho_{VU}\}$ for
$\{\mathcal O(U), \rho_{VU}\}$ to result from passing to a sheaf and then back to a presheaf is that
the presheaf be complete in the sense that both the following conditions hold:


(i) Whenever $\{U_j\}$ is an open covering of an open subset $U$ of $X$ and
$f \in \mathcal O(U)$ is an element such that $\rho_{U_j,U}(f) = 0$ for all $j$, then $f = 0$.
(ii) Whenever $\{U_j\}$ is an open covering of an open subset $U$ of $X$ and
$f_j$ is given in $\mathcal O(U_j)$ for each $j$ in such a way that $\rho_{U_j\cap U_k,U_j}(f_j) =
\rho_{U_j\cap U_k,U_k}(f_k)$ for all $j$ and $k$, then there exists $f \in \mathcal O(U)$ such that
$\rho_{U_j,U}(f) = f_j$ for all $j$.
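
A standard small example of a presheaf that fails condition (ii) is the presheaf of constant functions on a two-point discrete space. The sketch below models it with values in the group of integers modulo 2; the names `sections`, `restrict`, and `gluing_candidates` are hypothetical, and the example is an illustration rather than something taken from the text.

```python
# Base space X = {1, 2} with the discrete topology, covered by U1 and U2.
U = frozenset({1, 2})
U1 = frozenset({1})
U2 = frozenset({2})


def sections(open_set):
    """Constant presheaf with values in Z/2: O(empty set) = {0}, otherwise {0, 1}."""
    return [0] if not open_set else [0, 1]


def restrict(f, smaller, larger):
    """Restriction map from O(larger) to O(smaller): keep the constant value,
    except that every section restricts to 0 on the empty set."""
    return 0 if not smaller else f


def gluing_candidates(f1, f2):
    """Sections f over U that restrict to f1 on U1 and to f2 on U2."""
    return [f for f in sections(U)
            if restrict(f, U1, U) == f1 and restrict(f, U2, U) == f2]


# f1 = 0 on U1 and f2 = 1 on U2 agree on the empty intersection of U1 and U2 ...
assert restrict(0, U1 & U2, U1) == restrict(1, U1 & U2, U2)
# ... but no constant section over U restricts to both, so condition (ii) fails.
assert gluing_candidates(0, 1) == []
# Compatible data with the same constant value do glue.
assert gluing_candidates(1, 1) == [1]
```

Passing to the associated sheaf and back replaces this presheaf by the presheaf of locally constant functions, for which the gluing in condition (ii) is possible.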


The structure sheaf of the spectrum of $A$ is a certain sheaf of rings $(\operatorname{Spec} A, \mathcal O)$
with base space $\operatorname{Spec} A$. Just as in the case of regular (= polynomial) functions on
an affine variety, this sheaf will have the property that the ring of global sections
is isomorphic to the original ring (cf. Corollary 10.25). We shall describe $\mathcal O$ by
describing the presheaf. For each prime ideal $\mathfrak p$ of $A$, let $A_{\mathfrak p}$ be the localization of
$A$ at $\mathfrak p$, i.e., the localization of $A$ relative to the multiplicative system consisting
of the set-theoretic complement of $\mathfrak p$. This kind of localization is always a local
ring. The idea is to define a ring $\mathcal O(U)$ of regular functions for each open subset
$U$ of $\operatorname{Spec} A$ in such a way that the stalk $\mathcal O_{\mathfrak p}$ at the point $\mathfrak p$ ends up being $A_{\mathfrak p}$ for
each $\mathfrak p$. With affine varieties we were able to make the definition directly in terms
of the function field of the variety, i.e., the field of fractions of $A$; both $\mathcal O(U)$
and the stalk $\mathcal O_P(U)$ at each point $P$ ended up being subrings of this function
field. The complication for general $A$ is that we do not have a convenient analog
of the function field available in which all the localizations are subrings. Thus
we proceed by imitating the messier equivalent definition of regular function
given in Proposition 10.28. Namely, for $U$ open in $\operatorname{Spec} A$, let $\mathcal O(U)$ be the set of
functions $s$ from $U$ into the product $\prod_{\mathfrak p \in U} A_{\mathfrak p}$ such that $s(\mathfrak p)$ is in the $\mathfrak p$-th factor $A_{\mathfrak p}$
for each $\mathfrak p$ and such that $s$ is locally a quotient of members of $A$ in the following
sense: for each $\mathfrak p$ in $U$, there is to be an open neighborhood $V$ of $\mathfrak p$ within $U$ and
there are to be elements $a$ and $f$ in $A$ such that for each $\mathfrak q$ in $V$, the element $f$ is
not in $\mathfrak q$ and $s(\mathfrak q)$ equals $a/f$ in $A_{\mathfrak q}$. (Recall that any element of $A$ not in $\mathfrak q$ defines
an element in the multiplicative system leading to $A_{\mathfrak q}$; $f$ is to be such an element
for each $\mathfrak q$ in $V$.) The mappings $\rho_{VU}$ are taken as ordinary restriction mappings,
and the result is a presheaf. This presheaf is complete, and the associated sheaf
is the structure sheaf $(\operatorname{Spec} A, \mathcal O)$. An affine scheme is any sheaf of rings that is
isomorphic in a suitable sense to the structure sheaf of some ring.


3. Scheme. To define “scheme” and the notion that a scheme is defined over
some ring or some field, we need to back up and say a few more words about
mappings in connection with sheaves. A ringed space is a sheaf of rings, $(\operatorname{Spec} A, \mathcal O)$
being an example. Let $(X, \mathcal O_X)$ and $(Y, \mathcal O_Y)$ be two ringed spaces, and let
$\{\rho_{VU}\}$ and $\{\rho'_{VU}\}$ be their respective systems of restriction maps. A morphism
$(\sigma, \psi) : (X, \mathcal O_X) \to (Y, \mathcal O_Y)$ of ringed spaces consists of a continuous function
$\sigma : X \to Y$ and a collection $\psi$ of homomorphisms $\psi_U : \mathcal O_Y(U) \to \mathcal O_X(\sigma^{-1}(U))$
such that

$$\rho_{\sigma^{-1}V,\,\sigma^{-1}U} \circ \psi_U = \psi_V \circ \rho'_{VU}$$

whenever $U$ and $V$ are open subsets of $Y$ with $V \subseteq U$. The collection $\psi = \{\psi_U\}$
yields homomorphisms of stalks $\psi_P : \mathcal O_{Y,\sigma(P)} \to \mathcal O_{X,P}$ for each $P$ in $X$.

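One property of the definition is that if $\varphi : A \to B$ is a homomorphism of rings,
then there is an associated morphism $(\sigma, \psi) : (\operatorname{Spec} B, \mathcal O_B) \to (\operatorname{Spec} A, \mathcal O_A)$ of
ringed spaces. The continuous map $\sigma : \operatorname{Spec} B \to \operatorname{Spec} A$ is the map $\sigma = {}^a\varphi$
given by ${}^a\varphi(\mathfrak p) = \varphi^{-1}(\mathfrak p)$ for any prime ideal $\mathfrak p$ of $B$. The mapping $\psi$ on
stalks carries $\mathcal O_{\operatorname{Spec} A,\sigma(\mathfrak p)} = \mathcal O_{\operatorname{Spec} A,\varphi^{-1}(\mathfrak p)}$ to $\mathcal O_{\operatorname{Spec} B,\mathfrak p}$ and is what is induced
on the stalk by composition with $\varphi$. It is not quite true that every morphism
$(\sigma, \psi) : (\operatorname{Spec} B, \mathcal O_B) \to (\operatorname{Spec} A, \mathcal O_A)$ of ringed spaces arises from a ring
homomorphism. The morphism $(\sigma, \psi)$ of ringed spaces resulting from the
ring homomorphism $\varphi$ has the property that $\psi$ carries the maximal ideal $M_{\varphi^{-1}(\mathfrak p)}$
of the stalk $A_{\varphi^{-1}(\mathfrak p)}$ into the maximal ideal $M_{\mathfrak p}$ of the stalk $B_{\mathfrak p}$. A morphism
$(\sigma, \psi)$ of ringed spaces whose stalks are local rings is called a local morphism if
it has this property. With this definition one can show that every local morphism
of ringed spaces $(\sigma, \psi) : (\operatorname{Spec} B, \mathcal O_B) \to (\operatorname{Spec} A, \mathcal O_A)$ arises from some ring
homomorphism $\varphi : A \to B$. This result is to be compared with Corollary 10.40
for affine varieties.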

An isomorphism of ringed spaces is automatically local if all the stalks are 
local rings. The reason is that an isomorphism of one local ring onto another 
carries the maximal ideal of the first onto the maximal ideal of the second. Thus 
the earlier definition of affine scheme as a ringed space that is isomorphic to 
some (Spec A, O) concealed only the rather natural definition of isomorphism of
ringed spaces, not the more subtle condition “local.” 

A morphism of affine schemes is a local morphism of the affine schemes as 
ringed spaces. Then the classes of all affine schemes and morphisms of affine 
schemes together form a category. A scheme is a ringed space (X, O) such that 
each point of X has an open neighborhood for which the restriction of the ringed 
space to that part of the base is isomorphic to an affine scheme. One can define 
a natural notion of morphism for schemes, and the classes of all schemes and 
morphisms of schemes together form a category. 
4. Variety as a scheme. Let V be an affine variety over an algebraically
closed field, and let A(V) be the affine coordinate ring. We have just seen 
how Spec A(V) has the natural structure of an affine scheme. Since Spec A(V) 
includes all prime ideals of A(V), not just the maximal ideals, the continuous 
inclusion V — Spec A(V) is not onto. However, there is a natural relationship 
between the two, and there is a natural relationship between their rings of regular 
functions. The reason is that morphisms of affine varieties correspond exactly (in 
contravariant fashion) to homomorphisms of the affine coordinate rings, which in 
turn correspond exactly to morphisms of affine schemes. From the point of view of 
categories, therefore, the categories of affine varieties and affine schemes match 
perfectly. This description blurs what happens to the underlying algebraically 
closed field of scalars, and one wants to be able to say that the categories of affine 
varieties over k and affine schemes over k match perfectly. Making this statement 
requires an additional construction, which will be sketched in the next subsection. 

This correspondence can be extended suitably from affine varieties to quasipro- 
jective varieties, and the interested reader can find details on page 30 of Volume 2 
of Shafarevich’s books. 


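5. Scheme defined over a ring. If $A$ is a ring and $(X, \mathcal O_X)$ is a scheme,
then a morphism of schemes $(\sigma, \psi) : X \to \operatorname{Spec} A$ defines a homomorphism
$A \to \mathcal O_X(U)$ of rings for each open subset $U$ of $X$. Specifically $\psi_{\operatorname{Spec} A}$ carries
$\mathcal O_{\operatorname{Spec} A}(\operatorname{Spec} A) = A$ into $\mathcal O_X(X)$, and hence $\rho_{UX} \circ \psi_{\operatorname{Spec} A}$ carries $A$ into $\mathcal O_X(U)$
if $\{\rho_{VU}\}$ is the system of restriction maps for $(X, \mathcal O_X)$. The result is that $\mathcal O_X$
becomes a sheaf of $A$ algebras.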

Conversely if $\mathcal O_X$ is a sheaf of $A$ algebras, then one can construct a morphism
of schemes $X \to \operatorname{Spec} A$. In this case one says that $(X, \mathcal O_X)$ is a scheme over
$A$. Every sheaf of rings is a sheaf of $\mathbb Z$ algebras, and thus every scheme
is a scheme over Z. Schemes over Z are of special interest in number-theoretic 
situations, among others. The schemes produced from varieties in the previous 
subsection are schemes over the underlying field k. The notion of a scheme over a 
field that is not algebraically closed is one way of extending the theory of varieties 
to have it apply when the underlying field is not algebraically closed. 


6. Role of homological algebra. The sheaves of abelian groups over a fixed 
topological space X, with a natural definition of morphism, form a category, and 
one can define kernels and cokernels in this category. The result turns out to 
be an abelian category with enough injectives, and the homological algebra of 
Chapter IV is applicable. If $(X, \mathcal{O})$ is a sheaf over $X$, then formation of global 
sections, given by $(X, \mathcal{O}) \mapsto \mathcal{O}(X)$, is a covariant left exact functor. Since there 
are enough injectives in the category, the derived functors make sense, and the $k$th 
derived functor gives what is called the $k$th sheaf cohomology group $H^k(X, \mathcal{O})$ 
with coefficients in $\mathcal{O}$. This kind of cohomology is easy to use abstractly and hard 
to use concretely, but it can be shown to be isomorphic to other more concrete kinds 
of cohomology. In this way the cohomology of sheaves leads to generalizations 
of Euler characteristics and Betti numbers that have significance in number theory 
and geometry. 

In applications, there tends to be a ringed space $(X, \mathcal{R})$ (maybe a scheme) 
in the picture, and the sheaves $(X, \mathcal{O})$ often have the property that each stalk 
of $\mathcal{O}$ is a module for the corresponding stalk of $\mathcal{R}$. Then the above kind of 
theory is applicable for sheaves that are $\mathcal{R}$ modules in this sense, not merely 
sheaves of abelian groups. The interested reader can find details in Chapter II of 
Hartshorne’s book. 


BIBLIOGRAPHICAL REMARKS. The topic of schemes assumes knowledge of a 
certain core of algebraic geometry and commutative algebra, and it builds on more 
commutative algebra as it goes along. Some books mentioned in the Selected 
References that include algebraic geometry at the beginning level are those of 
Hartshorne (Chapter I), Harris, Reid, and Shafarevich (Volume 1). All these 
books have many geometric examples; this is particularly so for the book by 
Harris. Some books on commutative algebra are the ones by Atiyah-Macdonald, 
Eisenbud, Matsumura, and Zariski-Samuel. These lists are by no means exhaus- 
tive. There are in fact hundreds of books on the two subjects. To get a list of many 
of the ones in commutative algebra, one can search in the Library of Congress 
catalog at http://catalog.loc.gov, using the call number QA251.3; a few 
additional ones are sprinkled in among books with call number QA251. For 
books on algebraic geometry, one can search using the call number QA564. 

The book by Eisenbud-Harris on schemes is an introductory one written 
in a style that makes it comparatively easy for the reader to get an overview 
of the subject. Two older books on schemes are the ones by Macdonald and 
Mumford. Hartshorne’s book introduces schemes in Chapter II, and Volume 2 of 
Shafarevich’s books is on that topic. The end of Volume 2 of Shafarevich’s books 
contains a 20-page historical sketch of algebraic geometry, including discussion 
of some of the precursors of the subject of schemes. 
In all problems, $k$ is understood to be an algebraically closed field. 
1. If $P$ is in $\mathbb{P}^n$, show that the ideal $I(P)$ of members of $k[X_0, \dots, X_n]$ vanishing 
at all points $(x_0, \dots, x_n)$ in $k^{n+1} - \{0\}$ with $[x_0, \dots, x_n] = P$ is homogeneous. 
2. Let X be a Noetherian topological space. 
(a) Prove that X is compact. 
(b) Prove that every irreducible closed subset of X is connected. 


3. (a) Prove that the image of a quasiprojective variety $V$ under a regular function 
$f : V \to \mathbb{A}^1$ is connected. 
(b) Prove that if $V$ is a projective variety and $\varphi : V \to \mathbb{A}^n$ is a morphism, then 
$\varphi(V)$ is a one-point set. 
4. Let $U$ be the quasi-affine variety $U = \mathbb{A}^2 - \{(0, 0)\}$ in $\mathbb{A}^2$. Prove that $\mathcal{O}(U) = k[X, Y]$. 

5. Deduce from the previous problem, Corollary 10.25, and Theorem 10.38 that $U$ 
is not isomorphic to an affine variety. 


6. Prove that a rational map of an irreducible curve into an irreducible curve is 
dominant or is constant. 


7. Let $\varphi : U \to V$ be a dominant morphism between quasiprojective varieties. 
Prove that the induced mapping of local rings $\varphi_P^* : \mathcal{O}_{\varphi(P)}(V) \to \mathcal{O}_P(U)$ given 
in Proposition 10.42 is one-one. 

8. Let $V$ be the affine variety $V = V(WX - YZ)$ in $\mathbb{A}^4$, let $A(V)$ be the affine 
coordinate ring $k[W, X, Y, Z]/(WX - YZ)$, let $\bar{X}$ and $\bar{Y}$ be the images of $X$ 
and $Y$ in $A(V)$, and let $f = \bar{X}/\bar{Y}$ in the field of fractions of $A(V)$. Prove that 
there exist no members $a$ and $b$ of $A(V)$ with $f = a/b$ and $b(w, x, y, z) \neq 0$ 
whenever $wx = yz$ and one or both of $w$ and $y$ are nonzero. 

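9. Let $U$ and $V$ be quasiprojective varieties, and let $\varphi : U \to V$ be a function. 
Suppose that $U$ and $V$ are unions of nonempty open subsets $U = \bigcup_\alpha U_\alpha$ and 
$V = \bigcup_\alpha V_\alpha$ such that $\varphi(U_\alpha) \subseteq V_\alpha$ for all $\alpha$. Prove that $\varphi$ is a morphism if and 
only if each restriction $\varphi|_{U_\alpha} : U_\alpha \to V_\alpha$ is a morphism. 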

10. This problem concerns local extensions of regular functions from quasiprojective 
varieties to open sets in the ambient affine or projective space. 

(a) Let $V$ be an affine variety in $\mathbb{A}^n$, let $U$ be a nonempty open subset of $V$, let 
$f$ be in $\mathcal{O}(U)$, and let $P$ be a point in $U$. Prove that there exist an open 
neighborhood $U_0$ of $P$ in $V$ with $U_0 \subseteq U$, an open set $\widetilde{U}_0$ in $\mathbb{A}^n$, and a function 
$F$ in $\mathcal{O}(\widetilde{U}_0)$ such that $U_0 = V \cap \widetilde{U}_0$ and such that $F$ is an extension of $f|_{U_0}$. 

(b) Extend the result of (a) to make it valid for any quasiprojective variety $V$ in 
$\mathbb{P}^n$. 

11. Suppose that $X$ and $Y$ are quasiprojective varieties, that $U$ and $V$ are irreducible 
closed subsets of $X$ and $Y$, respectively, and that $\varphi : X \to Y$ is a morphism such 
that $\varphi(U) \subseteq V$. Prove that $\varphi : U \to V$ is a morphism. 

12. Prove that 
(a) the mapping $\varphi : \mathbb{P}^{n-1} \to \mathbb{P}^n$ given by $\varphi([x_0, \dots, x_{n-1}]) = [x_0, \dots, x_{n-1}, 0]$ 
is an isomorphism of $\mathbb{P}^{n-1}$ onto the projective hyperplane $H_n$ corresponding 
to the homogeneous ideal $(X_n)$ of $k[X_0, \dots, X_n]$, 

(b) any projective variety $V$ in $\mathbb{P}^n$ that lies in $H_n$ is isomorphic to a projective 
variety in $\mathbb{P}^{n-1}$, 

(c) any projective variety $V$ in $\mathbb{P}^n$ is isomorphic to a projective variety $V'$ in some 
$\mathbb{P}^r$ with $r \le n$ that is not contained in any projective hyperplane defined by 
a homogeneous ideal $(X_j)$ of $k[X_0, \dots, X_r]$. 


Problems 13-16 relate the classical condition for detecting a singularity in the affine 
case to the corresponding condition in the projective case. The key is an identity 
traditionally known as Euler’s Theorem that is proved as Problem 3 at the end of 
Chapter VIII. In these problems it is assumed that $F_1, \dots, F_r$ are homogeneous 
polynomials in $k[X_0, \dots, X_n]$, that $P = [x_0, \dots, x_n]$ is a point in $\mathbb{P}^n$ in their common 
locus of zeros, and that $P$ is in the image of $\mathbb{A}^n$ under $\beta_0$, i.e., that $x_0 \neq 0$. Define 
$f_1, \dots, f_r$ in $k[X_1, \dots, X_n]$ by $f_i(X_1, \dots, X_n) = F_i(1, X_1, \dots, X_n)$. 

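13. Define $J(F)(x_0', \dots, x_n')$ to be the $r$-by-$(n+1)$ matrix whose $(i, j)$th entry is 
$\frac{\partial F_i}{\partial X_j}(x_0', \dots, x_n')$ for $1 \le i \le r$ and $0 \le j \le n$, and define $J(f)(x_1', \dots, x_n')$ to 
be the $r$-by-$n$ matrix whose $(i, j)$th entry is $\frac{\partial f_i}{\partial X_j}(x_1', \dots, x_n')$ for $1 \le i \le r$ and 
$1 \le j \le n$. Prove that $\operatorname{rank} J(F)(x_0, \dots, x_n) = \operatorname{rank} J(F)(\lambda x_0, \dots, \lambda x_n)$ for 
all $\lambda \in k^\times$. 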


14. With notation as in Problem 13, prove that the $r$-by-$n$ matrix $J(f)(x_1', \dots, x_n')$ 
equals the $r$-by-$n$ matrix obtained by deleting the 0th column of the $r$-by-$(n+1)$ 
matrix $J(F)(1, x_1', \dots, x_n')$. 


15. Using Euler’s Theorem (Problem 3 at the end of Chapter VIII), prove concerning 
the point $P$ on the locus of common zeros of $F_1, \dots, F_r$ that the 0th column of 
the matrix $J(F)(x_0, \dots, x_n)$ is a linear combination of the other columns of the 
matrix. 


16. Deduce for the point $P$ on the locus of common zeros of $F_1, \dots, F_r$ that 
$\operatorname{rank} J(F)(x_0, x_1, \dots, x_n) = \operatorname{rank} J(f)(x_1/x_0, \dots, x_n/x_0)$. 


Problems 17-22 concern products of quasiprojective varieties. The Segre mapping 
$\sigma : \mathbb{P}^m \times \mathbb{P}^n \to \mathbb{P}^N$ with $N = mn + m + n$ was defined in Section 8 by 
$\sigma([x_0, \dots, x_m], [y_0, \dots, y_n]) = [w_{00}, \dots, w_{mn}]$ with $w_{ij} = x_i y_j$. Let us abbreviate 
$[w_{00}, \dots, w_{mn}]$ as $[\{w_{ij}\}]$ and $k[W_{00}, \dots, W_{mn}]$ as $k[\{W_{ij}\}]$. 


17. Prove that $\sigma$ is well defined and one-one. 


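18. Every member $[\{w_{ij}\}]$ of $\operatorname{image} \sigma$ has $w_{ij} w_{kl} = w_{il} w_{kj}$ for all $i, j, k, l$. Prove 
conversely that every member $[\{w_{ij}\}]$ of $\mathbb{P}^N$ with $w_{ij} w_{kl} = w_{il} w_{kj}$ for all $i, j, k, l$ 
is in $\operatorname{image} \sigma$, and deduce that $\operatorname{image} \sigma = V(\mathfrak{a})$, where $\mathfrak{a}$ is the ideal in $k[\{W_{ij}\}]$ 
generated by all $W_{ij} W_{kl} - W_{il} W_{kj}$. 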


19. This problem will prove that $\mathfrak{a}$ is a prime ideal, and in particular it will follow 
that $V(\mathfrak{a})$ is irreducible. Let $\varphi : k[\{W_{ij}\}] \to k[X_0, \dots, X_m, Y_0, \dots, Y_n]$ be the 
substitution homomorphism given by setting $W_{ij} = X_i Y_j$. Then $\ker \varphi$ is an ideal 
containing $\mathfrak{a}$. 

(a) By introducing a suitable monomial ordering in $k[\{W_{ij}\}]$, show that any 
monomial in $k[\{W_{ij}\}]$ of total degree $d$ is congruent modulo $\mathfrak{a}$ to a monomial 
of total degree $d$ of the form $M = \prod_{i,j} W_{ij}^{a_{ij}}$ having the property that $a_{ij} > 0$ 
implies that $a_{kl} = 0$ for all $(k, l)$ with $l > j$ and $k > i$. Call a monomial of 
this form reduced. 


(b) Suppose that $M = \prod_{i,j} W_{ij}^{a_{ij}}$ and $M' = \prod_{i,j} W_{ij}^{b_{ij}}$ are two distinct reduced 
monomials. By considering the first $W_{ij}$ for which $a_{ij} \neq b_{ij}$, prove that 
$\varphi(M) \neq \varphi(M')$. 

(c) Deduce that $\ker \varphi = \mathfrak{a}$, and show why it follows that $\mathfrak{a}$ is prime. 


20. Let $\mathfrak{p}$ be a prime ideal in $k[X_0, \dots, X_m]$, and let $R = k[X_0, \dots, X_m]/\mathfrak{p}$ be the 
quotient. 
(a) Prove that the ideal $\mathfrak{p}\, k[Y_0, \dots, Y_n]$ in $k[X_0, \dots, X_m, Y_0, \dots, Y_n]$ generated 
by all products of members of $\mathfrak{p}$ and polynomials in $Y_0, \dots, Y_n$ is prime. 

(b) By following the substitution homomorphism 
$k[\{W_{ij}\}] \to k[X_0, \dots, X_m, Y_0, \dots, Y_n]$ 
with a substitution homomorphism $k[X_0, \dots, X_m, Y_0, \dots, Y_n] \to R[Z]$, 
prove that whenever $U$ is a projective variety in $\mathbb{P}^m$ and $P$ is a point in $\mathbb{P}^n$, 
then $\sigma(U \times \{P\})$ is a projective variety in $\mathbb{P}^N$. 


21. Let $U$ and $V$ be projective varieties in $\mathbb{P}^m$ and $\mathbb{P}^n$, respectively. Problem 20 
shows that $\sigma(U \times \{v\})$ is a projective variety in $\mathbb{P}^N$ for each $v \in V$. Suppose 
that $\sigma(U \times V)$ is a union $E_1 \cup E_2$ of two closed sets in $\mathbb{P}^N$. 
(a) For $i$ equal to 1 or 2, define $V_i = \{v \in V \mid \sigma(U \times \{v\}) \nsubseteq E_i\}$. Why is 
$V_1 \cap V_2 = \varnothing$? 

(b) Prove that $V_1$ and $V_2$ are open by using bihomogeneous polynomials to 
exhibit each of $V_1$ and $V_2$ as a neighborhood of each of its points. 

(c) Deduce from (b) that $\sigma(U \times V)$ is a projective variety in $\mathbb{P}^N$. 

(d) Show how to deduce from (c) that if $U$ and $V$ are quasiprojective varieties in 
$\mathbb{P}^m$ and $\mathbb{P}^n$, respectively, then $\sigma(U \times V)$ is a quasiprojective variety in $\mathbb{P}^N$. 


22. (a) Prove that if $U$ and $V$ are quasiprojective varieties, then the projections of 
$U \times V$ to $U$ and $V$ are morphisms. Here the projection of $U \times V$ to $U$ is 
understood to be the map $\sigma(u, v) \mapsto u$ of $\sigma(U \times V)$ into $U$, and similarly 
for the projection to $V$. 

(b) If $\varphi : U \to X$ and $\psi : U \to Y$ are morphisms, prove that $(\varphi, \psi) : U \to 
X \times Y$ when defined by $(\varphi, \psi)(u) = (\varphi(u), \psi(u))$ is a morphism. 

(c) If $\varphi : U \to X$ and $\psi : V \to Y$ are morphisms, prove that $\varphi \times \psi : U \times V \to 
X \times Y$ when defined by $(\varphi \times \psi)(u, v) = (\varphi(u), \psi(v))$ is a morphism. 


Problems 23-25 make some observations about prime ideals and irreducible 
polynomials. 


23. Let $I = (f_1, \dots, f_r)$ be an ideal in $k[X, Y]$ such that the zero locus $V(I)$ is 
irreducible and such that $f_1, \dots, f_r$ are irreducible polynomials. 

(a) Prove that $I$ is prime if $\dim V(I) = 1$. 

(b) Give an example to show that $I$ need not be prime if $\dim V(I) = 0$. 
24. Fix a monomial ordering for $k[X_1, \dots, X_n]$, and let $I$ be a nonzero ideal in 
$k[X_1, \dots, X_n]$. Prove that if $I$ is prime, then the members of any minimal 
Gröbner basis of $I$ are irreducible polynomials. 

25. Suppose that $\operatorname{char}(k) \neq 2$. Within $k[X, Y, Z]$, let $E$ be the homogeneous 
subspace $k[X, Y, Z]_2$. The six monomials in $E$ form a $k$ basis of $E$ and may be used 
to identify $E$ with $k^6$. Under this identification prove that the subset of reducible 
polynomials in $E$, including the 0 polynomial, is an affine hypersurface of $k^6$. 


Problems 26-35 concern elliptic curves. An elliptic curve over $k$ is a pair $(E, O)$ 
consisting of a nonsingular irreducible projective curve $E$ of genus 1 and a distinguished 
point $O$. These problems use the Riemann-Roch Theorem and its associated 
notation in Chapter IX in order to exhibit a concrete realization of such a curve in 
$\mathbb{P}^2$ with $O$ on the line at infinity and with all other points of $E$ in $\mathbb{A}^2$. Such a curve 
has a remarkable structure; for further information, including further applications of 
the Riemann-Roch Theorem to these curves, see the book by Silverman. Corollary 
10.56 identifies the points of $E$ with the discrete valuations of the function field $k(E)$ 
over $k$. Let $v_O$ be the discrete valuation corresponding to $O$. 

26. For $n > 0$, prove that $\ell(n v_O) = n$. Use this result to find members $x$ and $y$ of 
$k(E)$ whose divisors satisfy $(x)_\infty = 2v_O$ and $(y)_\infty = 3v_O$. 

27. Prove that $[k(E) : k(x)] = 2$ and $[k(E) : k(y)] = 3$. 

28. Why does it follow from the previous problem that $k(E) = k(x, y)$? 

29. From the fact that $\ell(6v_O) = 6$, deduce a nontrivial linear dependence over $k$ 
among the members $1, x, y, x^2, xy, y^2, x^3$ of $k(E)$. Show that the coefficients 
of $y^2$ and $x^3$ are necessarily nonzero, and then scale $x$ and $y$ appropriately to 
show that the image of the function $\varphi : E - \{O\} \to \mathbb{P}^2$ defined by $\varphi(P) = 
[x(P), y(P), 1]$ is contained in the projective closure $C$ of the zero locus of the 
polynomial $f(X, Y) = (Y^2 + a_1 XY + a_3 Y) - (X^3 + a_2 X^2 + a_4 X + a_6)$. 

30. Prove that $f(X, Y)$ is irreducible and that $C$ is therefore a projective curve. 

31. Why is $\varphi : E - \{O\} \to C$ a morphism? Why does it follow that $\varphi$ extends to a 
morphism $\Phi : E \to C$? 

32. Deduce from Problem 28 that $\Phi$ is birational. 

33. Show that $C$ is nonsingular at its point at infinity. 

34. Show that if $C$ is singular at $(x_0, y_0)$ in $\mathbb{A}^2$, then the member of $k(E)$ given by 
$z = (y - y_0)(x - x_0)^{-1}$ has $v_O(z) = -1$ and $v_P(z) \ge 0$ for all $P$ in $E - \{O\}$. 


35. Deduce from Problems 33 and 34 that $C$ is nonsingular, and explain why it 
follows that $\Phi : E \to C$ is an isomorphism. 


HINTS FOR SOLUTIONS OF PROBLEMS 
1. We are interested in odd $p$’s such that $\left(\frac{m}{p}\right) = +1$. Factor $m$ as $\prod_j p_j^{k_j}$. Then 
quadratic reciprocity gives 
$\left(\frac{m}{p}\right) = \prod_j \left(\frac{p_j}{p}\right)^{k_j} = \prod_{k_j \text{ odd}} \left(\frac{p_j}{p}\right) = \prod_{k_j \text{ odd}} (-1)^{\frac12(p-1)\frac12(p_j-1)} \left(\frac{p}{p_j}\right)$. 
We consider $p \equiv 1 \bmod 4$ and $p \equiv 3 \bmod 4$ separately. For $p \equiv 1 \bmod 4$, the set in 
question consists of those $p$’s for which $\left(\frac{p}{p_j}\right)$ is $-1$ for an even number of those $k_j$’s that 
are odd. This is the union over all such systems of minus signs of the intersection over 
$j$ of the finitely many arithmetic progressions for which the residue $\left(\frac{p}{p_j}\right)$ equals the 
$j$th sign. For a single system of minus signs, the result is an arithmetic progression of 
the form $k \prod_{k_j \text{ odd}} p_j + b$ by the Chinese Remainder Theorem. Each of these contains 
a nonempty set of primes by Dirichlet’s Theorem, and hence $P$ is nonempty. 

For $p \equiv 3 \bmod 4$, if $\prod_{k_j \text{ odd}} (-1)^{\frac12(p_j-1)}$ is $+1$, then the set in question is of the 
same form as above. If $\prod_{k_j \text{ odd}} (-1)^{\frac12(p_j-1)}$ is $-1$, then the set in question consists of 
those $p$’s for which $\left(\frac{p}{p_j}\right)$ is $-1$ for an odd number of those $k_j$’s that are odd, and this 
again is the finite union of arithmetic progressions. 

2. For (a), the proof of necessity of Theorem 1.6b remains valid when the prime $p$ 
is replaced by the integer $m$. For (b), the first paragraph of the proof of the sufficiency 
of Theorem 1.6b handles matters if $m$ is odd. 

3. For $D = -56$, $H$ has order 4, but $H'$ has order 3 because $3x^2 \pm 2xy + 5y^2$ 
are improperly equivalent but not properly equivalent. A 3-element set has no group 
structure such that a 4-element group maps homomorphically onto it. 


4. For (a), the product of any two integers representable as $ax^2 + bxy + cy^2$ is 
representable by the class of the square, which is the class of the inverse because the 
class is assumed to have order 3. The class of the inverse is the class of (a, —b, c), 
and this represents the same integers as (a, b, c). 

For (b), we seek reduced triples. These are $(a, b, c)$ with $|b| \le a \le c$ and with 
$b^2 - 4ac = D = -23$, and we know that $3ac \le |D|$ and that $b$ has the same 
parity as $D$. Hence $b$ is odd, and the inequalities $3b^2 \le 3a^2 \le 3ac \le 23$ show 
that $|b| = 1$. For $|b| = 1$, we have $1 - 4ac = -23$ and $ac = 6$. Since $a \le c$, the 
possibilities with $|b| = 1$ are $(1, \pm 1, 6)$ and $(2, \pm 1, 3)$. Since $(1, 1, 6)$ and $(1, -1, 6)$ 
are properly equivalent by Proposition 1.7, $|b| = 1$ leads to just the three possibilities 
$(1, 1, 6)$, $(2, 1, 3)$, and $(2, -1, 3)$. Proposition 1.7 shows that these lie in distinct 
proper equivalence classes, and thus $h(-23) = 3$. 
For (c), the general theory shows that (1, 1, 6) corresponds to the identity class, 
and therefore the other two reduced forms are in classes of order 3. 

For (d), we first track down what happens to the forms. If we write ~ for proper 
equivalence, then we have 


$(2, 1, 3)(2, 1, 3) \sim (2, 1, 3)(3, -1, 2) \sim (2, 5, 6)(3, 5, 4)$ 
$= (6, 5, 2) \sim (2, -5, 6) \sim (2, -1, 3)$, 


and the last form is improperly equivalent to $(2, 1, 3)$. The next step is to interpret this 
chain with actual variables. If the initial variables are $x_1, y_1, x_2, y_2$, then the change 
at the first step from $(2, 1, 3)$ to $(3, -1, 2)$ comes from $x_2 = y_2'$, $y_2 = -x_2'$ while 
leaving $x_1$ and $y_1$ unchanged as $x_1 = x_1'$, $y_1 = y_1'$. The change at the second step 
from $(2, 1, 3)$ to $(2, 5, 6)$ and from $(3, -1, 2)$ to $(3, 5, 4)$ comes from the translations 
$x_1' = x_1'' + y_1''$, $y_1' = y_1''$, $x_2' = x_2'' + y_2''$, $y_2' = y_2''$. The multiplication step comes from 
Proposition 1.9 and is given by $x_3 = x_1'' x_2'' - 2 y_1'' y_2''$ and $y_3 = 2 x_1'' y_2'' + 3 x_2'' y_1'' + 5 y_1'' y_2''$. 
And so on. The final result is that 


$(2x_1^2 + x_1 y_1 + 3y_1^2)(2x_2^2 + x_2 y_2 + 3y_2^2) = 2X^2 + XY + 3Y^2$, 


where $X = x_1(-x_2 + y_2) + y_1(x_2 + 2y_2)$ and $Y = y_1(x_2 - y_2) + x_1(x_2 + y_2)$. 
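
The product identity for the form (2, 1, 3) lends itself to a numerical spot check. The following Python sketch is an illustration added here, not part of the hint itself; the function names are assumptions made for the example. It evaluates both sides of the identity on a small grid of integer points.

```python
from itertools import product


def form(x, y):
    """Value of the form (2, 1, 3), i.e. 2x^2 + xy + 3y^2."""
    return 2 * x * x + x * y + 3 * y * y


def composed(x1, y1, x2, y2):
    """The bilinear substitution (X, Y) from the product identity for (2, 1, 3)."""
    X = x1 * (-x2 + y2) + y1 * (x2 + 2 * y2)
    Y = y1 * (x2 - y2) + x1 * (x2 + y2)
    return X, Y


def identity_holds(bound=4):
    """Check form(x1, y1) * form(x2, y2) == form(X, Y) on a small integer grid."""
    rng = range(-bound, bound + 1)
    return all(
        form(x1, y1) * form(x2, y2) == form(*composed(x1, y1, x2, y2))
        for x1, y1, x2, y2 in product(rng, repeat=4)
    )
```

A grid check of this kind is evidence rather than proof, but since both sides are polynomials of degree 2 in each pair of variables, agreement on a grid with more than two values per variable does force equality.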


5. The equality (ey °) (¢ -) (i a) = (a s) shows this. 


6. For reduced forms we seek $(a, b, c)$ with $a > 0$, $c > 0$, $|b| \le a \le c$. We know 
that $3ac \le |D| = 67$, and $D$ odd implies $b$ odd. From $3b^2 \le 3a^2 \le 3ac \le 67$, we 
obtain $3b^2 \le 67$ and $|b| \le 4$. So $|b|$ is 1 or 3. For $|b| = 1$, $\frac14(b^2 - D) = \frac14(b^2 + 67) = 
17$; then $17 = ac$, and $a = 1$ and $c = 17$. Since $(1, 1, 17)$ is properly equivalent 
to $(1, -1, 17)$ by Proposition 1.7, we obtain only one proper equivalence class from 
this pair. For $|b| = 3$, $\frac14(b^2 - D) = \frac14(9 + 67) = 19$ forces $ac = 19$ and then $a = 1$ 
and $c = 19$. Then $|b| \le a$ is not satisfied. So $|b| = 3$ gives no proper equivalence 
classes, and $h(-67) = 1$. 
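
Searches for reduced forms such as those for $D = -23$ and $D = -67$ can be automated. The Python sketch below is a hedged illustration, not a routine from the text: it assumes that a reduced form is a triple $(a, b, c)$ with $|b| \le a \le c$ and with $b \ge 0$ whenever $|b| = a$ or $a = c$, that only primitive forms are counted, and that $D < 0$. The function name `reduced_forms` is hypothetical.

```python
from math import gcd


def reduced_forms(D):
    """List primitive reduced forms (a, b, c) of discriminant D < 0.

    Assumed convention: |b| <= a <= c, with b >= 0 if |b| == a or a == c.
    The bound 3a^2 <= 3ac <= |D| limits the search for a.
    """
    if D >= 0 or D % 4 not in (0, 1):
        raise ValueError("D must be a negative discriminant")
    forms = []
    a = 1
    while 3 * a * a <= -D:
        # b ranges over -a < b <= a, which already excludes b == -a.
        for b in range(-a + 1, a + 1):
            if (b * b - D) % (4 * a) != 0:
                continue
            c = (b * b - D) // (4 * a)
            if c < a:
                continue
            if a == c and b < 0:
                continue
            if gcd(gcd(a, abs(b)), c) != 1:
                continue  # skip imprimitive forms
            forms.append((a, b, c))
        a += 1
    return forms
```

Under these assumptions the length of the returned list is the class number, and the lists for $D = -23$ and $D = -67$ agree with the hand computations.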


7. The 6 cycles are 


(1,8, —15), (—15, 7,2), (2,7, —15), (—15, 8, 1); 
(—1, 8, 15), (15,7, —2), (—2,7, 15), (15, 8, —1); 
(3,8, —5), (—5, 7,6), (6,5, —9), (—9, 4,7), (7,3, —10), (—10, 7, 3); 
(—3, 8,5), (5,7, —6), (—6,5,9), (9,4, -7), (—7,3, 10), (10, 7, —3); 
(5,8, —3), (—3,7, 10), (10,3, —7), (—7,4,9), (9,5, —6), (—6, 7,5); 
(—5, 8,3), (3,7, —10), (—10,3,7), (7,4, —9), (—9,5,6), (6,7, —5). 


8. The form $(1, 1, 12)$ corresponds to the identity class, the classes of $(2, \pm 1, 6)$ are 
inverses of one another, and the classes of $(3, \pm 1, 4)$ are inverses of one another. The 
group structure has to be cyclic, and any element other than the identity can be taken as 
a generator. Let us take $a$ to be the class of $(2, 1, 6)$. We are to identify $a^2$. The form 
$(2, 1, 6)$ is aligned with itself (having the same $b$ component), it has $j = 6/2 = 3$, 
and the composition formula of Proposition 1.9 leads to $(2 \cdot 2, 1, j) = (4, 1, 3)$. 
This is properly equivalent to $(3, -1, 4)$, and we do not have to follow through the 
algorithm of Theorem 1.6a to identify the product in our list. The result is that 
$a \leftrightarrow (2, 1, 6)$, $a^2 \leftrightarrow (3, -1, 4)$, $a^3 = (a^2)^{-1} \leftrightarrow (3, 1, 4)$, $a^4 = a^{-1} \leftrightarrow (2, -1, 6)$, 
and $a^5 = 1 \leftrightarrow (1, 1, 12)$. 
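
Identifying which reduced form is properly equivalent to a given positive definite form can be done mechanically. The sketch below is an illustrative Python version of the usual reduction procedure, alternating translations $x \mapsto x + ky$ with the swap $(a, b, c) \mapsto (c, -b, a)$; it is not claimed to be the exact algorithm of Theorem 1.6a, and the name `reduce_form` is an assumption.

```python
def reduce_form(a, b, c):
    """Return the reduced form properly equivalent to a positive definite (a, b, c).

    Assumed convention for 'reduced': |b| <= a <= c, with b >= 0 when
    |b| == a or a == c.  Requires a > 0 and b*b - 4*a*c < 0.
    """
    if a <= 0 or b * b - 4 * a * c >= 0:
        raise ValueError("form must be positive definite")
    while True:
        # Translation x -> x + k*y, chosen so that -a < b <= a afterwards.
        k = (a - b) // (2 * a)
        b, c = b + 2 * a * k, a * k * k + b * k + c
        if a > c or (a == c and b < 0):
            # Swap (x, y) -> (-y, x): a proper equivalence (a, b, c) -> (c, -b, a).
            a, b, c = c, -b, a
        else:
            return (a, b, c)
```

For example, `reduce_form(4, 1, 3)` returns `(3, -1, 4)`, matching the identification of $a^2$ in the hint for Problem 8, and `reduce_form(6, 5, 2)` returns `(2, -1, 3)`, matching the end of the chain in the hint for Problem 4d.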

10. For (a), the result is known for $n$ prime by Theorem 1.2. By induction and 
the definition of the Jacobi symbol, it is enough to handle $n = ab$ when $a$ and 
$b$ can be handled. We have $\frac12(n - 1) = \frac12(ab - 1) = \frac12 b(a - 1) + \frac12(b - 1) 
\equiv \frac12(a - 1) + \frac12(b - 1) \bmod 2$, the last step following because $b$ is odd. Therefore 
$(-1)^{\frac12(n-1)} = (-1)^{\frac12(a-1) + \frac12(b-1)} = \left(\frac{-1}{a}\right)\left(\frac{-1}{b}\right) = \left(\frac{-1}{n}\right)$, the last step following by 
Problem 9a. 

For (b), we argue similarly, and the key computation is $\frac18(n^2 - 1) = \frac18(a^2 b^2 - 1) = 
\frac18 b^2(a^2 - 1) + \frac18(b^2 - 1) \equiv \frac18(a^2 - 1) + \frac18(b^2 - 1) \bmod 2$, the last step following 
because $b^2$ is odd. 


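11. Allowing primes to appear more than once, write factorizations of $m$ and $n$ as 
$m = \prod_{i=1}^r p_i$ and $n = \prod_{j=1}^s q_j$. Then Theorem 1.2 gives 

$\left(\frac{m}{n}\right) = \prod_{j=1}^s \prod_{i=1}^r \left(\frac{p_i}{q_j}\right) = \prod_{j=1}^s \prod_{i=1}^r \left(\frac{q_j}{p_i}\right) (-1)^{\frac12(p_i - 1)\frac12(q_j - 1)} = \left(\frac{n}{m}\right) (-1)^{\sum_{i=1}^r \sum_{j=1}^s \frac12(p_i - 1)\frac12(q_j - 1)}$. 

Since 

$\sum_{i=1}^r \sum_{j=1}^s \frac12(p_i - 1)\frac12(q_j - 1) = \Big[\sum_{i=1}^r \frac12(p_i - 1)\Big]\Big[\sum_{j=1}^s \frac12(q_j - 1)\Big]$ 

and since $\sum_{j=1}^s \frac12(q_j - 1) \equiv \frac12(n - 1) \bmod 2$ and $\sum_{i=1}^r \frac12(p_i - 1) \equiv \frac12(m - 1) \bmod 2$ 
by the same argument as in Problem 10a, the required formula follows. 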
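
The three formulas for the Jacobi symbol treated in Problems 10 and 11 can be checked numerically for small odd moduli. The Python sketch below is an added illustration under stated assumptions: the Jacobi symbol is computed directly from its definition as a product of Legendre symbols over the prime factorization, the Legendre symbol is evaluated by Euler's criterion, and all function names are hypothetical.

```python
from math import gcd


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a % p, (p - 1) // 2, p)
    return -1 if r == p - 1 else r  # r is 0 or 1 in the remaining cases


def prime_factors(n):
    """Prime factors of n >= 1, listed with multiplicity (trial division)."""
    out, d = [], 2
    while d * d <= n:
        while n % d == 0:
            out.append(d)
            n //= d
        d += 1
    if n > 1:
        out.append(n)
    return out


def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0, as a product of Legendre symbols."""
    if n <= 0 or n % 2 == 0:
        raise ValueError("n must be odd and positive")
    result = 1
    for p in prime_factors(n):
        result *= legendre(a, p)
    return result


def check_formulas(limit=60):
    """Verify the supplementary laws and reciprocity for odd m, n below limit."""
    odds = range(1, limit, 2)
    first = all(jacobi(-1, n) == (-1) ** ((n - 1) // 2) for n in odds)
    second = all(jacobi(2, n) == (-1) ** ((n * n - 1) // 8) for n in odds)
    recip = all(
        jacobi(m, n) == jacobi(n, m) * (-1) ** (((m - 1) // 2) * ((n - 1) // 2))
        for m in odds for n in odds if gcd(m, n) == 1
    )
    return first and second and recip
```

Factoring by trial division is adequate only for small inputs; it is used here because it follows the definition of the symbol literally.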


12. For (a), choose by Dirichlet’s Theorem a sufficiently large prime $p$ that is 
$\equiv 3 \bmod 8$ and is in particular $\equiv 3 \bmod 4$. If 8 divides $|G|$, then the fact that $|G|$ 
divides $p + 1$ implies that 8 divides $p + 1$. So $p \equiv -1 \bmod 8$. Since $p$ was chosen 
with $p \equiv 3 \bmod 8$, this is a contradiction. So 8 cannot divide $|G|$. 

For (b), choose by Dirichlet’s Theorem a sufficiently large prime $p$ that is $\equiv 
7 \bmod 12$ and is in particular $\equiv 3 \bmod 4$. If 3 divides $|G|$, then 3 divides $p + 1$. Thus 
$p \equiv -1 \bmod 3$. Since also $p \equiv 3 \bmod 4$, $p \equiv 11 \bmod 12$. But $p$ was chosen with 
$p \equiv 7 \bmod 12$. This is a contradiction, and 3 cannot divide $|G|$. 

For (c) with an odd prime $q > 3$ given, choose by Dirichlet’s Theorem a sufficiently 
large prime $p$ that is $\equiv 3 \bmod 4q$ and is in particular $\equiv 3 \bmod 4$. If $q$ divides $|G|$, 
then $q$ divides $p + 1$, and $p + 1 \equiv 0 \bmod q$. Meanwhile, $p \equiv 3 \bmod 4q$ implies that 
$p + 1 \equiv 4 \bmod 4q$ and $p + 1 \equiv 4 \bmod q$, contradiction. So $q$ cannot divide $|G|$. 

13. For (a), choose by Dirichlet’s Theorem a sufficiently large prime $p$ that is 
$\equiv 5 \bmod 12$ and is in particular $\equiv 2 \bmod 3$ and $\equiv 1 \bmod 4$. If 4 divides $|G|$, then 4 
divides $p + 1$, which is $\equiv 2 \bmod 4$. So 4 cannot divide $|G|$. 

For (b), choose by Dirichlet’s Theorem a sufficiently large prime $p$ that is $\equiv 
2 \bmod 9$ and is in particular $\equiv 2 \bmod 3$. If 9 divides $|G|$, then 9 divides $p + 1$, which 
is $\equiv 3 \bmod 9$. So 9 cannot divide $|G|$. 

For (c) with an odd prime $q > 3$ given, choose by Dirichlet’s Theorem a sufficiently 
large prime $p$ that is $\equiv 2 \bmod 3q$ and is in particular $\equiv 2 \bmod 3$. If $q$ divides $|G|$, 
then $q$ divides $p + 1$, which is $\equiv 3 \bmod 3q$ and hence is $\equiv 3 \bmod q$. So $q$ cannot 
divide $|G|$. 

14. The integers in $(a, r)$ are exactly the multiples of $a$, since such an integer $n$ has 
to be of the form $n = ca + dr$ for integers $c$ and $d$. This equation says that $n = ca$ 
and $0 = dr$, since 1 and $r$ are linearly independent over $\mathbb{Q}$. The integer $N(s) = s\sigma(s)$ 
is in $I$ because $s$ is in $I$ and $\sigma(s)$ is in $R$, and thus $N(s)$ has to be a multiple of $a$. 


15. Write $I = (a, r)$ with $a > 0$ an integer and $r$ in $I$ by Lemma 1.19b. As in the 
previous problem, the integer $a$ is characterized uniquely in terms of $I$ as the least 
positive integer in $I$. Put $r = b + g\delta$ for suitable integers $b$ and $g$. Without loss of 
generality, we may assume that $g > 0$. Using the division algorithm and possibly 
replacing $b$ by $b - na$ for some integer $n$, we may assume that $0 \le b < a$. 

With these conventions in place, let us see that $g$ necessarily divides $a$. The fact 
that $a\delta$ has to be in $I$ means that $a\delta$ has an expansion $a\delta = c_1 a + c_2(b + g\delta)$ with 
integer coefficients. Then $a\delta = c_2 g\delta$, and $g$ must divide $a$. 

In particular, $0 < g \le a$ is forced. To see that $b$ and $g$ are uniquely determined, 
let $\{a, b' + g'\delta\}$ be another such $\mathbb{Z}$ basis. Since $b' + g'\delta = c_1 a + c_2(b + g\delta)$ and since 
symmetrically we have $b + g\delta = c_1' a + c_2'(b' + g'\delta)$, we obtain $g' = c_2 g = c_2 c_2' g'$. 
Therefore $|c_2| = 1$. Meanwhile, we must have 


$c_1 a + c_2 b = b'$ and $c_2 g\delta = g'\delta$. 


The second of these equations shows that $c_2 > 0$. Thus $c_2 = 1$. Finally $c_1 a = b' - b$ 
with $0 \le b < a$ and $0 \le b' < a$ forces $b' - b = 0$. Therefore $a$, $b$, and $g$ are uniquely 
determined. 

To complete the proof, we need to see that $g$ divides $b$ and that $ag$ divides $N(b + g\delta)$. 
Since $a\delta$ is in $I$, $a\delta = c_1' a + c_2'(b + g\delta)$. Hence $c_2' g = a$ and $c_1' a + c_2' b = 0$. 
Substituting the first of these equations into the second gives $c_1' c_2' g + c_2' b = 0$. Since 
$c_2' \neq 0$ from the equality $c_2' g = a$, $c_1' g + b = 0$. Thus $g$ divides $b$. 

To see that $ag$ divides $N(b + g\delta)$, we use the fact that $g\sigma(\delta)(b + g\delta)$ is in $I$ to 
write $bg\sigma(\delta) + \delta\sigma(\delta)g^2 = d_1 ag + d_2 g(b + g\delta)$ for some integers $d_1$ and $d_2$. Then 
$N(b + g\delta) = b^2 + bg(\delta + \sigma(\delta)) + \delta\sigma(\delta)g^2 = b^2 + bg\delta + d_1 ag + d_2 g(b + g\delta)$. 
Equating coefficients of $\delta$ and 1 gives 


$0 = bg + d_2 g^2$ and $N(b + g\delta) = b^2 + d_1 ag + d_2 bg$. 


Since $g > 0$, the first of these equations gives $d_2 = -bg^{-1}$. Substituting into the 
second equation gives 


$N(b + g\delta) = b^2 + d_1 ag - (bg^{-1})bg = d_1 ag$, 
and we see that $ag$ divides $N(b + g\delta)$. 
16. We are to show that $\mathbb{Z}a + \mathbb{Z}(b + g\delta)$ is closed under multiplication by arbitrary 
members of $R$. It is enough to treat multiplication by 1 and by $\delta$. There is no problem 
for 1. Since $\delta + \sigma(\delta)$ is in $\mathbb{Z}$, it is enough to show that there exist integers $c_1, c_2, d_1, d_2$ 
with 


$\delta a = c_1 a + c_2(b + g\delta)$ and $\sigma(\delta)(b + g\delta) = d_1 a + d_2(b + g\delta)$. 
In view of the assumed divisibility, we can put $c_2 = ag^{-1}$, $c_1 = -bg^{-1}$, $d_2 = -bg^{-1}$, 
and $d_1 = N(b + g\delta)(ag)^{-1}$. Then the first equation is certainly satisfied, and the 
question concerning the second equation, once we have multiplied it by $g$, is whether 
we have an equality 


$g\sigma(\delta)(b + g\delta) = N(b + g\delta) - b^2 - bg\delta$. 


The left side is $N(b + g\delta) - b(b + g\delta)$, and thus equality indeed holds. 
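
The equivalence between the ideal property of $\mathbb{Z}a + \mathbb{Z}(b + g\delta)$ and the three divisibility conditions ($g$ divides $a$, $g$ divides $b$, and $ag$ divides $N(b + g\delta)$) can be tested by brute force. The Python sketch below is an added illustration with hypothetical function names. It assumes only that $\delta$ satisfies $\delta^2 = \operatorname{Tr}(\delta)\,\delta - N(\delta)$ with integer trace `tr` and norm `nm`, and it represents $s + t\delta$ as the pair `(s, t)`.

```python
def in_lattice(s, t, a, b, g):
    """Is s + t*delta in the lattice Z*a + Z*(b + g*delta)?"""
    if t % g != 0:
        return False
    return (s - (t // g) * b) % a == 0


def times_delta(s, t, tr, nm):
    """Multiply s + t*delta by delta, using delta^2 = tr*delta - nm."""
    return (-t * nm, s + t * tr)


def is_ideal(a, b, g, tr, nm):
    """The lattice is an ideal exactly when delta maps both generators into it."""
    return all(
        in_lattice(*times_delta(s, t, tr, nm), a, b, g)
        for (s, t) in [(a, 0), (b, g)]
    )


def divisibility_holds(a, b, g, tr, nm):
    """g | a, g | b, and a*g | N(b + g*delta)."""
    norm = b * b + b * g * tr + g * g * nm
    return a % g == 0 and b % g == 0 and norm % (a * g) == 0


def criterion_agrees(tr, nm, bound=12):
    """Compare the two conditions for all 0 < g, a <= bound and 0 <= b < a."""
    return all(
        is_ideal(a, b, g, tr, nm) == divisibility_holds(a, b, g, tr, nm)
        for a in range(1, bound + 1)
        for g in range(1, bound + 1)
        for b in range(a)
    )
```

As sample inputs, `tr = 1, nm = 6` corresponds to $D = -23$ (odd discriminant, $N(\delta) = \frac14(1 - m)$ with $m = -23$), and `tr = 0, nm = 14` corresponds to $D = -56$.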


17. From Section 7 the relevant formula is $N(I) = |\sqrt{D}|^{-1} |r_1 \sigma(r_2) - \sigma(r_1) r_2|$. 
Here we can take $r_1 = a$ and $r_2 = c + d\delta$. Substitution gives 


$N(I) = |\sqrt{D}|^{-1} |a|\, |\sigma(c + d\delta) - (c + d\delta)|$ 
$= |\sqrt{D}|^{-1} |a|\, |c + d\sigma(\delta) - c - d\delta| = |\sqrt{D}|^{-1} |ad|\, |\sigma(\delta) - \delta|$. 


The expression $|\sqrt{D}|^{-1} |\sigma(\delta) - \delta|$ arose in Section 7 in the computation of $N(R)$ and 
was shown to be 1. Thus $N(I) = |ad|$. 


18. For (a), the algorithm of Section IV.9 of Basic Algebra shows how to align 
matters so as to compute the quotient of a free abelian group by a subgroup when 
the subgroup is given by generators. The given relationship between the generators 
$a$ and $b + g\delta$ of Problem 15 with the $\mathbb{Z}$ basis of $R$ is 


$\begin{pmatrix} a \\ b + g\delta \end{pmatrix} = \begin{pmatrix} a & 0 \\ b & g \end{pmatrix} \begin{pmatrix} 1 \\ \delta \end{pmatrix}$. 


The procedure is to do row and column operations on the coefficient matrix to bring 
it into diagonal form. Since $g$ divides $b$, a column operation replaces the $b$ by 0. 
We obtain a diagonal matrix with diagonal entries $a$ and $g$, and the quotient group 
is identified as $(\mathbb{Z}/a\mathbb{Z}) \oplus (\mathbb{Z}/g\mathbb{Z})$. Thus $ag$ is identified as the number of elements 
in the quotient group $R/I$. Problem 17 identified $ag$ as $N(I)$, and thus $N(I)$ is the 
number of elements in $R/I$. 

For (b), the inclusion $I \subseteq J$ induces a quotient mapping of the finite group $R/I$ 
onto $R/J$. As a homomorphic image of $R/I$, $R/J$ must have an order that divides 
the order of $R/I$. In view of (a), $N(J)$ divides $N(I)$. The equality $I = J$ holds if 
and only if the quotient mapping is one-one, and this happens, because of the finite 
cardinalities, if and only if $N(I) = N(J)$. 
19. The relevant arguments for the first three parts of this problem already appear 
in Chapters VIII and IX of Basic Algebra, and thus we can be brief. For (a), the 
Chinese Remainder Theorem (Theorem 8.27 of Basic Algebra) shows that $R/IJ \cong 
R/I \times R/J$, and then $N(IJ) = N(I)N(J)$ by Problem 18a. For (b), the inductive 
argument for (**) in the proof of Theorem 9.60 of Basic Algebra shows 
that $\dim_{\mathbb{Z}/p\mathbb{Z}} R/P^e = ef$, and thus $|R/P^e| = p^{ef}$. For (c), Corollary 8.63 of 
Basic Algebra and Problem 18a above together show that $N(I) = \prod_j N(P_j^{k_j})$ 
if $I = \prod_j P_j^{k_j}$ is the unique factorization of the ideal $I$. Since $N(P_j^{k_j}) = N(P_j)^{k_j}$ 
by (b), $N(I) = \prod_j N(P_j)^{k_j}$, and (c) follows immediately. 
For (d), we use Problem 15 to write $I = (a, b + g\delta)$; then 


$I\sigma(I) = (a^2,\ a(b + g\delta),\ a(b + g\sigma(\delta)),\ N(b + g\delta))$. 


Each of the generators on the right side lies in the principal ideal $(ag)$. In fact, $a^2$ is in 
$(ag)$ because $g$ divides $a$, $a(b + g\delta)$ and $a(b + g\sigma(\delta))$ are in $(ag)$ because $g$ divides 
$b$, and $N(b + g\delta)$ is in $(ag)$ because $ag$ divides $N(b + g\delta)$. Therefore $I\sigma(I) \subseteq (ag)$. 
Since $N(I) = ag$ by Problem 17, Problem 19c shows that $N(I\sigma(I)) = N((ag))$. 
Then $I\sigma(I) = (ag) = (N(I))$ by Problem 18b. 

20. The only ideal $I$ with $N(I) = 1$ is $I = R$. Problem 19c therefore shows that a 
nontrivial factorization of $(p)R$ leads to a nontrivial factorization of its norm, which 
is $p^2$. This factorization must be $p^2 = p \cdot p$, and thus $(p)R$ factors nontrivially at most 
into two factors, each with norm $p$. 


21. For (a), we use Problem 15 to write a nontrivial factor $I$ of $(2)R$ as $I = 
(a, b + g\delta)$. Problem 17 shows that $2 = N(I) = ag$ with $g$ dividing $a$. Therefore 
$a = 2$ and $g = 1$. So the only possible factors are of the form $I = (2, b + \delta)$ with 
$0 \le b < a = 2$. Thus $b = 0$ or $b = 1$. When $D$ is odd, we have $\operatorname{Tr}(\delta) = 1$ and 
$N(\delta) = \frac14(1 - m)$. Then $N(b + \delta) = b^2 + b\operatorname{Tr}(\delta) + N(\delta) = b^2 + b + \frac14(1 - m) \equiv 
\frac14(1 - m) \bmod 2$. If $m \equiv 5 \bmod 8$, then we see that 2 does not divide $N(b + \delta)$, and 
thus $(2)R$ cannot have a nontrivial factor. 

For (b), we again have $N(b + \delta) = b^2 + b\operatorname{Tr}(\delta) + N(\delta) = b^2 + b + \frac14(1 - m) \equiv 
\frac14(1 - m) \bmod 2$, and the condition $m \equiv 1 \bmod 8$ makes the right side 0. Thus 2 
divides $N(b + \delta)$, and $(2, \delta)$ and $(2, 1 + \delta)$ are both ideals by Problem 16. The product 
of these ideals is $(2, \delta)(2, 1 + \delta) = (4, 2\delta, 2(1 + \delta), \delta(1 + \delta))$ and contains $(2)R$ because 
$2 = 2(1 + \delta) - 2\delta$. Moreover, the product has norm 4 by Problems 17 and 19c, and 
this matches the norm of $(2)R$. Thus Problem 18b shows that $(2, \delta)(2, 1 + \delta) = (2)R$. 

For (c) and (d), $\delta = -\sqrt{m}$. Thus $N(b + \delta) = b^2 + b\operatorname{Tr}(\delta) + N(\delta) = b^2 - m = 
b^2 - \frac14 D$. If $D/4 \equiv 3 \bmod 4$, then $b^2 - \frac14 D$ is divisible by 2 for $b = 1$. If $D/4 \equiv 
2 \bmod 4$, then $b^2 - \frac14 D$ is divisible by 2 for $b = 0$. With $b$ taking on the appropriate 
value in the two cases, $(2, b + \delta)$ is an ideal by Problem 16. The square of this ideal 
is $(4, 2(b + \delta), (b - \sqrt{m})^2) = (4, 2(b + \delta), b^2 + m - 2b\sqrt{m})$. The definition of $b$ 
makes $b^2 + m$ even in every case, and hence $(2, b + \delta)^2 \supseteq (2)R$. Since the norms of 
the ideals on the two sides are both 4, the two ideals must be equal. 
22. Arguing as in the previous problem, we see that any nontrivial factor of $(p)R$ 
must have norm $p$ and therefore must be given by $(p, x + \delta)$ for some $x$ such that $p$ 
divides $N(x + \delta) = x^2 + x\operatorname{Tr}(\delta) + N(\delta)$. 

For (a), $\operatorname{Tr}(\delta) = 1$ and $N(\delta) = \frac14(1 - m) = \frac14(1 - D)$, and the condition is that 
$p$ divide $x^2 + x + \frac14(1 - D)$. This means that $x^2 + x + \frac14(1 - D) \equiv 0 \bmod p$ 
is to have a solution. When this happens, Problem 16 ensures that $(p, x + \delta)$ is 
an ideal. Then $(p, x + \sigma(\delta))$ is an ideal as well, and the product of the two is 
$(p^2, p(x + \delta), p(x + \sigma(\delta)), N(x + \delta))$. Since $p$ divides $N(x + \delta)$, this product ideal 
is contained in $(p)R$. The product ideal and $(p)R$ both have norm $p^2$, and therefore 
they are equal. 

For (b), $\operatorname{Tr}(\delta) = 0$ and $N(\delta) = -m = -D/4$, and the condition is that $p$ divide 
$x^2 - D/4$. This means that $x^2 - D/4 \equiv 0 \bmod p$ is to have a solution. When this 
happens, Problem 16 ensures that $(p, x + \delta)$ is an ideal. Then $(p, x + \sigma(\delta))$ is an 
ideal as well, and the product of the two is $(p^2, p(x + \delta), p(x + \sigma(\delta)), N(x + \delta))$. 
Since $p$ divides $N(x + \delta)$, this product ideal is contained in $(p)R$. The product ideal 
and $(p)R$ both have norm $p^2$, and therefore they are equal. 

For (c), the respective conditions for factorization in (a) and (b) are that 
$x^2 + x + \frac14(1 - D) \equiv 0 \bmod p$ and $x^2 - D/4 \equiv 0 \bmod p$ be solvable. In both cases 
the quadratic expression on the left side has discriminant $D$. Hence factorization 
occurs if and only if $D$ is a square modulo $p$. 
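
The conclusion that the factorization condition amounts to $D$ being a square modulo the odd prime $p$ can be confirmed by brute force for small cases. The following Python sketch is an added illustration, not part of the hint; the function names are hypothetical, and 0 is counted as a square, so primes dividing $D$ are included.

```python
def solvable_mod_p(c1, c0, p):
    """Does x^2 + c1*x + c0 = 0 have a solution modulo p?  (Brute force.)"""
    return any((x * x + c1 * x + c0) % p == 0 for x in range(p))


def is_square_mod_p(D, p):
    """Is D congruent to a square modulo p?  Zero counts as a square."""
    return any((y * y - D) % p == 0 for y in range(p))


def factorization_condition(D, p):
    """Solvability of the congruence attached to D.

    D = 1 mod 4:  x^2 + x + (1 - D)/4 = 0 mod p.
    D = 0 mod 4:  x^2 - D/4 = 0 mod p.
    """
    if D % 4 == 1:
        return solvable_mod_p(1, (1 - D) // 4, p)
    if D % 4 == 0:
        return solvable_mod_p(0, -(D // 4), p)
    raise ValueError("D must be congruent to 0 or 1 modulo 4")
```

The two conditions agree for odd $p$ because $4\big(x^2 + x + \frac14(1 - D)\big) = (2x + 1)^2 - D$ and $4(x^2 - D/4) = (2x)^2 - D$, with 2 invertible modulo $p$.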


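23. In both cases we are assuming that $(p)R$ has a factor $I = (p, x + \delta)$ with 
$0 \le x < p$. Using Problem 15, let us write $\sigma(I) = (p, x + \sigma(\delta)) = (p, y + \delta)$ 
with $0 \le y < p$. Choose integers $c$ and $d$ with $x + \sigma(\delta) = cp + d(y + \delta)$. Since 
$\sigma(\delta) = \operatorname{Tr}(\delta) - \delta$, the equation is $x + \operatorname{Tr}(\delta) - \delta = cp + dy + d\delta$, and we obtain 
$x + \operatorname{Tr}(\delta) = cp + dy$ and $-\delta = d\delta$. Thus $d = -1$, $x + \operatorname{Tr}(\delta) = cp - y$, and 
$cp = x + y + \operatorname{Tr}(\delta)$. From $0 \le x < p$ and $0 \le y < p$, we have $0 \le x + y + \operatorname{Tr}(\delta) \le 
2(p - 1) + \operatorname{Tr}(\delta) \le 2p - 1$. So $c$ in the equation $cp = x + y + \operatorname{Tr}(\delta)$ has to be 1 
or 0, and the equation is $x + y = p - \operatorname{Tr}(\delta)$ or $x + y = -\operatorname{Tr}(\delta)$. The condition that 
$\sigma(I) = I$ is the condition that $x = y$, hence that $2x = p - \operatorname{Tr}(\delta)$ or $2x = -\operatorname{Tr}(\delta)$. 
When $D$ is odd, this says that $x = \frac12(p - 1)$; when $D$ is even, it says that $x = 0$. 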

24. Since σ((p, x + δ)) = (p, x + σ(δ)), the two factors are the same if and only
if σ(I) = I. Problem 23 says that the latter equality holds for D odd if and only if
x = ½(p − 1) and that it holds for D even if and only if x = 0. In the two cases we
know from Problem 14 that p divides N(x + δ) = x² + x Tr(δ) + N(δ).

When D is odd, this result says that p divides x² + x + ¼(1 − D), hence that it
divides 4x² + 4x + (1 − D) = (2x + 1)² − D. Then p divides D if and only if p
divides 2x + 1, if and only if x = ½(p − 1).

When D is even, we know from Problem 14 that p divides x² − m. Hence p
divides 4(x² − m) = 4x² − D = (2x)² − D. Then p divides D if and only if p
divides 2x, if and only if x = 0.

25. Theorem 1.14 shows that the genus group G is the quotient of the abelian group
H modulo its subgroup of squares. The subgroup of squares consists of the elements
in the product of the cyclic subgroups of orders 2^{k_1−1}, …, 2^{k_r−1}, q_1^{l_1}, …, q_s^{l_s}, and the
quotient is the product of r copies of a cyclic group of order 2. Thus G has order 2^r.
The subgroup of elements of H whose order divides 2 is the product of the 2-element
subgroups of the cyclic groups of orders 2^{k_1}, …, 2^{k_r}. It is a product of r copies of a
cyclic group of order 2 and hence is abstractly isomorphic to G.


26. If P is a nonzero prime ideal, then so is σ(P). Since σ² = 1, the mapping
P ↦ σ(P) is a permutation of order 2 on the nonzero prime ideals. Evidently the
prime ideals of type (i) above are permuted in 2-cycles, and the prime ideals of types
(ii) and (iii) are left fixed.

If a nonzero ideal I has prime factorization I = ∏_j P_j^{k_j}, then σ(I) = ∏_j σ(P_j)^{k_j}.
When σ(I) = I, we can match the factors and their exponents. We conclude that the
factorization of I is as

$$I = \prod_{\substack{\text{pairs } (P_j,\sigma(P_j))\\ \text{of type (i)}}} (P_j\,\sigma(P_j))^{k_j} \prod_{\substack{\text{ideals } P_j\\ \text{of type (ii)}}} P_j^{k_j} \prod_{\substack{\text{ideals } P_j\\ \text{of type (iii)}}} P_j^{k_j}.$$

Each factor in the first product is of the form (N(P_j))^{k_j} by Problem 19d, each factor
in the second product is of the form (p)^{k_j} for some prime p not dividing D, and each
P_j² contributing to the third factor is of the form (p) for some prime p dividing D.
The result follows.


27. For (a), the only nontrivial step in the displayed formula is the third equality,
which follows because xσ(x) = N(x) = 1 by hypothesis. If we take y = (1 + x)⁻¹,
then the displayed formula gives x = (1 + x)(1 + σ(x))⁻¹ = y⁻¹σ(y) as required.

For (b), the equality σ(y)y⁻¹ = x remains valid when y is replaced by ny with
n ∈ Z, and thus we may take y to be in R. Now let y and z be in R with σ(z)z⁻¹ =
x = σ(y)y⁻¹. Then σ(zy⁻¹) = zy⁻¹, and zy⁻¹ is in Q. Among all y ∈ R with
σ(y)y⁻¹ = x, let y₀ be one with |N(y)| as small as possible; y₀ exists because |N(y)|
is an integer in each case. If σ(z)z⁻¹ = x, write z = u + vδ, y₀ = a + bδ, and zy₀⁻¹ =
p/q with GCD(p, q) = 1. Then qu + qvδ = qz = py₀ = pa + pbδ, and we obtain
qu = pa and qv = pb. Therefore q divides a and b, and q⁻¹y₀ = q⁻¹a + q⁻¹bδ is
in R. Then y = q⁻¹y₀ is another element in R with σ(y)y⁻¹ = x, and it contradicts
the minimal choice of |N(y₀)| unless |q| = 1. We conclude that z = ±py₀.

28. In (a), N(I²) = N((x)) says that N(I)² = |N(x)|N(R) = |N(x)|. Therefore
N(x⁻¹N(I)) = |N(x)|⁻¹N(N(I)) = |N(x)|⁻¹N(I)² = 1, and xN(I)⁻¹ has
norm 1.

In (b), Problem 27b gives us y₀ ∈ R with σ(y₀)y₀⁻¹ = xN(I)⁻¹. Then we
compute that σ((y₀)I) = σ(y₀)σ(I) = y₀xN(I)⁻¹σ(I) = y₀N(I)⁻¹(x)σ(I) =
y₀N(I)⁻¹I²σ(I) = y₀N(I)⁻¹(N(I))I = y₀I.

For (c), suppose N(y₀) > 0. Then Problem 26a shows that (y₀)I = (a)J_S for
some a ∈ Z, and this gives the required strict equivalence. If N(y₀) < 0, then
N(y₀√m) > 0, and σ((y₀√m)I) = (y₀√m)I; Problem 26a shows that (y₀√m)I
= (a)J_S for some a ∈ Z, and this gives the required strict equivalence.
29. For (a), since m < 0 and m is neither −1 nor −3, the possible units are
ε = ±1. The equality σ(x) = εx says that x is in Z if ε = +1, and it says that x is
in Z√m if ε = −1.

For (b), when m = −1 or m = −3, we have D = −4 or D = −3; thus
g = 0, and there is nothing to prove. For other values of m < 0, consider J_S. Then
N(J_S) = ∏_{p∈S} p, and this is some divisor D′ of D with no repeated factors. Let us
write J_S = (a, b + gδ) by Problem 15. Then ag = D′ and g divides a. Since D′ is
square free, a = D′ and g = 1. If J_S is principal, then (a) shows that J_S = (c) for an
integer c or J_S = (d√m) for an integer d.

Suppose J_S = (c). Then b + δ = rc for some r ∈ R. Write r = x + yδ for
integers x and y. Then b + δ = cx + cyδ shows that 1 = cy and hence that c divides
1. Thus J_S = R, and the set S is empty.

Suppose J_S = (d√m). Then b + δ = dx√m + dyδ√m for some integers x and y.
If D is odd, then the equation reads b + ½(1 − √m) = dx√m + ½dy(1 − √m)√m.
Comparing coefficients of √m gives −½ = d(x + ½y), hence −1 = d(2x + y). Therefore
d = ±1, J_S = (√m) = (√D), N(J_S) = |D|, and S = E. If D is even, then the
equation reads b − √m = dx√m − dym, and we obtain −1 = dx. So d = ±1,
J_S = (√m), N(J_S) = |m| = |D/4| = D′. This is the product of all prime divisors of
D if D/4 ≡ 2 mod 4 and all of them but 2 if D/4 ≡ 3 mod 4.

For (c), let E′ be a subset of g members of E, and assume that the element of E
that is not in E′ is not 2 unless D = −4. If S and S′ are two subsets of E′, then
J_S J_{S′} = (n)J_T, where n = ∏_{p∈S∩S′} p and T = (S − S′) ∪ (S′ − S). If J_S and J_{S′}
represent the same genera, then J_S J_{S′} is principal, and J_T must be principal. The set
T can be empty only if S = S′, and it has to be a subset of E′ and thus cannot be all of
E. According to (b), the only way that J_T can be principal is thus that S = S′ or that
all of the conditions D even, D/4 ≡ 3 mod 4, and T = E′ = E − {2} are satisfied.
In the latter case the construction of E′ shows that D = −4, T is empty, and S = S′.
Thus the ideals J_S for S ⊆ E′ represent distinct genera in every case.


For (d), the units are ±ε₁ⁿ. Since N(ε₁) = −1, the units of norm 1
are the ±ε₁^{2n}. So suppose that ε = ±ε₁^{2n}. Put ε₀ = ε₁ⁿ. Then ε₀σ(ε₀) = N(ε₀) =
(−1)ⁿ, and σ(ε₁ⁿx) = σ(ε₀)σ(x) = σ(ε₀)εx = (−1)ⁿε₀⁻¹εx = ±(−1)ⁿε₁⁻ⁿε₁^{2n}x =
±(−1)ⁿε₁ⁿx = sε₁ⁿx with s = ±(−1)ⁿ. If s = +1, then ε₁ⁿx is in Z, while if s = −1,
then ε₁ⁿx is in Z√m. Then the same steps as in (b) and (c) finish the argument.

For (e), the four mentioned ideals are principal, and we have (1) = J_S for S
empty and (√m) = J_S for S equal to the set of prime divisors of m. For these two
ideals, N(1) > 0 and N(√m) < 0. Consider (y₀⁺) and (y₀⁻). The ideal (y₀⁺) has
σ((y₀⁺)) = (σ(y₀⁺)) = (y₀⁺ε₁) = (y₀⁺), and hence it is of the form (n)J_S for some S.
Then y₀⁺ = nr for some r ∈ R, and it follows that n⁻¹y₀⁺ is in R. This contradicts
the minimality of |N(y₀⁺)| unless |n| = 1. Hence (y₀⁺) = J_S for some S. Similarly
(y₀⁻) = J_S for some S. Thus all four principal ideals are of the form J_S.

Let us see that the four principal ideals are distinct. Neither ideal (y₀⁺) nor (y₀⁻)
can equal (1). In fact, if (y₀⁺) were to equal (1), then y₀⁺ would be a unit ε, and we
would have ε₁ = σ(y₀⁺)(y₀⁺)⁻¹ = σ(ε)ε⁻¹ = ε⁻², in contradiction to the fact that ε₁
is fundamental. Similarly (y₀⁻) cannot equal (1).

Since σ(y₀⁺√m)(y₀⁺√m)⁻¹ = −σ(y₀⁺)√m (y₀⁺)⁻¹(√m)⁻¹ = −σ(y₀⁺)(y₀⁺)⁻¹
= −ε₁, the definition of y₀⁻ shows that y₀⁺√m = ny₀⁻ for some integer n. Passing
to norms gives −mN(y₀⁺) = n²N(y₀⁻). Therefore N(y₀⁺) and N(y₀⁻) have opposite
sign.

We have seen that two of the four elements 1, √m, y₀⁺, y₀⁻ have positive norm,
two have negative norm, and the two of positive norm generate distinct principal
ideals. To see that the two of negative norm generate distinct ideals, we consider
separately the cases N(y₀⁻) < 0 and N(y₀⁺) < 0. If N(y₀⁻) < 0, we use the equation
−mN(y₀⁺) = n²N(y₀⁻) proved in the previous paragraph. If (y₀⁻) = (√m), then
cancellation gives N(y₀⁺) = ±1; then y₀⁺ is a unit, and we have seen that it cannot
be. If N(y₀⁺) < 0, we use the definition of y₀⁺ in the same way as in the previous
paragraph to obtain −mN(y₀⁻) = n²N(y₀⁺) for some integer n. Cancellation shows
that N(y₀⁻) = ±1; then y₀⁻ is a unit, and we have seen that it cannot be. Thus the
four principal ideals are distinct.

Now suppose that (x) is any principal ideal fixed by σ. As in the statement of the
problem, we have σ(x) = εx for some unit ε. The most general unit is of the form
ε = ±ε₁ⁿ. We shall produce constructively the element of Problem 27 corresponding
to ε. Put y_{0,n} = ε₁^{n/2} if n is even and y_{0,n} = ε₁^{(n+1)/2}y₀⁺ if n is odd. For n even we
have

$$\sigma(y_{0,n}x) = \sigma(y_{0,n})\varepsilon x = \pm\sigma(\varepsilon_1^{n/2})\varepsilon_1^{n}x = \pm\varepsilon_1^{-n/2}\varepsilon_1^{n}x = \pm y_{0,n}x,$$

and for n odd we have

$$\sigma(y_{0,n}x) = \sigma(y_{0,n})\varepsilon x = \pm\sigma(\varepsilon_1^{(n+1)/2}y_0^{+})\varepsilon_1^{n}x = \pm\varepsilon_1^{-(n+1)/2}\sigma(y_0^{+})\varepsilon_1^{n}x = \pm\varepsilon_1^{(n-1)/2}\sigma(y_0^{+})x = \pm\varepsilon_1^{(n-1)/2}y_0^{+}\varepsilon_1 x = \pm y_{0,n}x.$$

Thus σ(y_{0,n}x) = ±y_{0,n}x for all n. Therefore y_{0,n}x is in Z or in Z√m, depending
on the sign ±. Depending on the sign, |N(y_{0,n}x)| = |N(y_{0,n})||N(x)| thus is either
the square of an integer or m times the square of an integer. If n is even, then
|N(y_{0,n})| = 1, and |N(x)| is therefore either the square of an integer or m times the
square of an integer. Since |N(x)| is the value of the norm of (x), there are only two
possible S's for which this can happen. If n is odd, then |N(y_{0,n})| = a for a certain
square-free integer a > 1, as we have seen. Therefore |N(x)| has to be either a⁻¹ times
the square of an integer or ma⁻¹ times the square of an integer. So there are only two
possible S's in this case. Thus there are only four possible S's in all cases, and these
have been accounted for. So the number of principal ideals among the J_S's is exactly
four. To complete the proof, we now argue as in (c) but consider only possibilities
for which the product of two J_S's is (n) times one of the two J_S's given by a principal
ideal with a generator of positive norm.


30. Since D is fundamental, (a₁, b, c₁) is automatically primitive. Then Lemma
1.10 produces a properly equivalent form that represents some integer a relatively
prime to D. The rest follows from the argument in the second paragraph of the proof
of sufficiency in Theorem 1.6b.


31. For (a), choose an integer r such that b + 2ar = kD for some integer k; this is
possible because GCD(D, 2a) = 1. Then the translation x = x′ + ry′, y = y′ leads
from ax² + bxy + cy² to ax′² + kDx′y′ + c′y′² for some c′. The discriminant of the new
form is still D = k²D² − 4ac′, and thus 4ac′ ≡ 0 mod D. Since GCD(4a, D) = 1,
c′ ≡ 0 mod D.

For (b), b has to be even because D = b² − 4ac is even. Write b = 2b′. Choose an
integer s such that b′ + as = kD for some k; this is possible because GCD(a, D) = 1.
Then the translation x = x′ + sy′, y = y′ leads from ax² + bxy + cy² to
ax′² + 2kDx′y′ + c′y′² for some c′. The discriminant of the new form is D =
4k²D² − 4ac′, where c′ = (4a)⁻¹D(4k²D − 1) = a⁻¹(D/4)(4k²D − 1). Modulo
D, this expression is −ā(D/4), where ā is an integer with aā ≡ 1 mod D. Here
a is odd, and hence a² ≡ 1 mod 8. If 2^t is the exact power of 2 dividing D,
then aa ≡ 1 mod 2^t, and hence ā ≡ a mod 2^t. If p is any odd prime dividing
D, then p divides D/4, and hence ā(D/4) ≡ 0 ≡ a(D/4) mod p. Therefore
ā(D/4) ≡ a(D/4) mod D, and we conclude that c′ ≡ −a(D/4) mod D.
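
The translation step in (a) can be illustrated on a concrete form. This is only a sketch under stated assumptions: the form (2, 1, 3) of discriminant −23 is a hypothetical example chosen so that GCD(D, 2a) = 1, and the helper names are not from the text.

```python
def translate(a, b, c, r):
    """Apply x = x' + r*y', y = y' to the form a*x^2 + b*x*y + c*y^2."""
    return a, b + 2 * a * r, a * r * r + b * r + c

def discriminant(a, b, c):
    return b * b - 4 * a * c

# Hypothetical example: (2, 1, 3) has odd discriminant D = -23 and GCD(D, 2a) = 1.
a, b, c = 2, 1, 3
D = discriminant(a, b, c)
n = abs(D)
# Solve b + 2*a*r = 0 mod D; pow(x, -1, n) needs Python 3.8 or later.
r = (-b * pow(2 * a, -1, n)) % n
a2, b2, c2 = translate(a, b, c, r)

assert discriminant(a2, b2, c2) == D   # the translation keeps the discriminant
assert b2 % n == 0                     # the middle coefficient is a multiple of D
assert c2 % n == 0                     # hence c' = 0 mod D, as in part (a)
```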

32. For (a), clearing fractions in the expression ax² + kDxy + lDy² = r yields
au² + kDuv + lDv² = rw². Suppose a prime p divides GCD(w, D). Then p divides
au². Since GCD(a, D) = 1, p divides u. Referring back to the equation, we see that
p² divides au² and kDuv, hence divides lDv². Thus p divides lv². The discriminant
is D = k²D² − 4alD, and divisibility of l by p would force p² to divide the left side
D. Hence p does not divide l, and p must divide v. Then p divides both u and v,
in contradiction to the minimality of the common denominator w. We conclude that
GCD(w, D) = 1. Taking the equation au² + kDuv + lDv² = rw² modulo D gives
au² ≡ rw² mod D. Since r and w are relatively prime to D, so is u. Thus we can
rewrite this congruence as a ≡ d²r mod D for some integer d relatively prime to D.

For (b), the same argument gives a′ ≡ d′²r mod D. Since d is relatively prime to
D, we can rewrite the congruence for a as r ≡ d⁻²a mod D, and then a′ ≡ d′²r ≡
(d⁻¹d′)²a mod D.

For (c), the given forms are properly equivalent over Z to (a, kD, lD) and to
(a′, k′D, l′D), respectively, by Problem 31a. Proper equivalence over Q means that
the two forms take on the same rational values, one of which is the integer a′. Part
(b) therefore shows that a′ = as² + nD for some integers s and n, necessarily with
GCD(s, D) = 1. Modulo D, the forms are given by ax² and a′x′², and the first
can be transformed into the second by the substitution x = sx′, y = s⁻¹y′, where
s⁻¹ is the multiplicative inverse of s in Z/DZ. In fact, substitution into ax² gives
a(sx′)² = (as²)x′² ≡ a′x′² mod D. This substitution is given by the matrix
$\begin{pmatrix} s & 0 \\ 0 & s^{-1} \end{pmatrix}$ in SL(2, Z/DZ).


33. Part (a) is almost the same as Problem 32a. Clearing fractions leads to
au² + kDuv + (lD − a(D/4))v² = rw², and the argument that no odd prime p
divides GCD(w, D) is the same. Suppose that 2 divides w. The equation modulo 4
is then au² − a(D/4)v² ≡ 0 mod 4 with D/4 congruent to 2 or 3 modulo 4. Since
2 divides w, at least one of u and v must be odd. If D/4 ≡ 3 mod 4, the congruence
becomes a(u² + v²) ≡ 0 mod 4, which is impossible with at least one of u and v
odd. If D/4 ≡ 2 mod 4, the congruence becomes a(u² + 2v²) ≡ 0 mod 4, which
again is impossible with at least one of u and v odd. Thus GCD(w, D) = 1. Taking
the equation modulo D and using the invertibility of r and w modulo D, we have
ar⁻¹w⁻²(u² − (D/4)v²) ≡ 1 mod D.

For (b), let p be an odd prime divisor of D. The above congruence then becomes
ar⁻¹w⁻²u² ≡ 1 mod p. Similarly with the second form, there is some w′ prime to
D such that a′r⁻¹w′⁻²u′² ≡ 1 mod p. Comparing the two expressions, we see that
a modulo p is the product of a′ and an invertible square.

For (c), the above congruence becomes ar⁻¹w⁻²(u² + v²) ≡ 1 mod 4. This forces
u² + v² ≡ 1 mod 4. Since w has to be odd, w² ≡ 1 mod 4. Hence ar⁻¹ ≡ 1 mod 4.
Similarly a′r⁻¹ ≡ 1 mod 4, and therefore a ≡ a′ mod 4.

For (d), the above congruence becomes ar⁻¹(u² − (D/4)v²) ≡ 1 mod 8, since w
is odd. If D/4 ≡ 2 mod 8, we obtain ar⁻¹(u² − 2v²) ≡ 1 mod 8. Here u has to be
odd, and thus ar⁻¹(1 − 2v²) ≡ 1 mod 8. If v is even, this says that a ≡ r mod 8; if
v is odd, it says that a ≡ −r mod 8. Putting this conclusion together with a similar
conclusion about the second form, we obtain a′ ≡ ±a mod 8.

If D/4 ≡ 6 mod 8, we obtain ar⁻¹(u² + 2v²) ≡ 1 mod 8. Here u has to be odd,
and thus ar⁻¹(1 + 2v²) ≡ 1 mod 8. If v is even, this says that a ≡ r mod 8; if v
is odd, it says that a ≡ 3r mod 8. Putting this conclusion together with a similar
conclusion about the second form, we obtain a′ ≡ a mod 8 or a′ ≡ 3a mod 8.
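
The residue computations in (d) are small enough to enumerate. The sketch below is only a sanity check, with hypothetical variable names, of which values u² − 2v² and u² + 2v² can take modulo 8 when u is odd.

```python
# For odd u we have u^2 = 1 mod 8, so only the parity of v matters.
minus_case = {(u * u - 2 * v * v) % 8 for u in range(1, 8, 2) for v in range(8)}
plus_case = {(u * u + 2 * v * v) % 8 for u in range(1, 8, 2) for v in range(8)}

# D/4 = 2 mod 8: u^2 - 2v^2 is 1 or -1 mod 8, which leads to a = r or a = -r mod 8.
assert minus_case == {1, 7}
# D/4 = 6 mod 8: u^2 + 2v^2 is 1 or 3 mod 8, which leads to a = r or a = 3r mod 8.
assert plus_case == {1, 3}
```

In the second case the passage from 3a ≡ r to a ≡ 3r uses 3 · 3 ≡ 1 mod 8.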


For (e), we shall assemble a member of SL(2, Z/DZ) one prime at a time and
use the Chinese Remainder Theorem. For odd primes p dividing D, choose s_p with
a′ ≡ s_p²a mod p, and introduce the matrix M_p = $\begin{pmatrix} s_p & 0 \\ 0 & s_p^{-1} \end{pmatrix}$ in SL(2, Z/pZ). If D/4 ≡

3 mod 4, introduce the matrix M> = ( ‘ °) in SLQ, Z/4Z). If D/4 = 2 mod 4, let 


Mp = C :) in SL(2, Z/8Z) if D/4 = 6 mod 8, and let My = (| :) in SL(2, Z/8Z) 
if D/4 ≡ 2 mod 8. The Chinese Remainder Theorem produces a unique matrix with
entries in Z/DZ that is congruent to M_p modulo each odd prime divisor p of D and is
congruent to M₂ modulo the power of 2 dividing D. Call this matrix M = $\begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix}$.
It has determinant 1 modulo D and hence lies in SL(2, Z/DZ). Then substitution of
x = αx′ + βy′ and y = γx′ + δy′ into the form a(x² − (D/4)y²) modulo D leads
to the form a′(x′² − (D/4)y′²) modulo D.


34. These problems establish a function from the set of equivalence classes of 
binary quadratic forms over Z with discriminant D, the equivalence relation being 
proper equivalence over Q, onto the set of equivalence classes of binary quadratic 
forms over Z with discriminant D, the equivalence relation being proper equivalence 
over Z/DZ. The number of elements in the domain has to be ≥ the number of
elements in the range.
35. The steps in solving Problems 32 and 33 involve relating a to r modulo
each prime power dividing D. These relationships are the same as the relationships 
between a and r’ if the form modulo D represents r’ and GCD(r’, D) = 1, and the 
relationships are transitive. Thus the genus characters take the same values at r as 
they do at r’, and they take the same values at a as well. 


36. Multiplication is the operation on proper equivalence classes of forms that 
corresponds to composition of aligned representatives of the classes, and composition 
is defined in such a way that the set of values of the composition is the set of products 
of a value of one form by a value of the other. The values are unaffected by proper 
equivalence over Z. 


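37. For (a), D/4 has an odd number 2t + 1 of prime factors 4k + 3. Use of the
Jacobi symbol with a odd and p varying over the prime divisors of D/4 gives

$$\prod_{p}\Big(\frac{a}{p}\Big) = \prod_{p=4k+1}\Big(\frac{a}{p}\Big)\prod_{p=4k+3}\Big(\frac{a}{p}\Big) = \prod_{p=4k+1}\Big(\frac{p}{a}\Big)\prod_{p=4k+3}\xi(a)\Big(\frac{p}{a}\Big) = \xi(a)\Big(\frac{D/4}{a}\Big).$$

Therefore

$$\xi(a)\prod_{p}\Big(\frac{a}{p}\Big) = \Big(\frac{D/4}{a}\Big) = \Big(\frac{D}{a}\Big).$$

For (b) and (c), say that the number of prime factors 4k + 3 of D/8 is t. With
p varying over the odd prime divisors of D, the same computation as above gives
$\prod_{p}\big(\frac{a}{p}\big) = \xi(a)^{t}\big(\frac{D/8}{a}\big)$. Then $\big(\frac{D}{a}\big) = \big(\frac{2}{a}\big)\big(\frac{D/8}{a}\big) = \eta(a)\xi(a)^{t}\prod_{p}\big(\frac{a}{p}\big)$. One easily checks
that t is even if D/4 ≡ 2 mod 8 and is odd if D/4 ≡ 6 mod 8, and the result follows.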
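
The reciprocity step used in this computation can be tested numerically. The sketch below is illustrative only: `jacobi` is the standard binary algorithm for the Jacobi symbol, `xi` is a hypothetical name for the sign (−1)^((a−1)/2), and the check assumes that a is odd, positive, and prime to p.

```python
def jacobi(a, n):
    """Jacobi symbol (a/n) for odd n > 0, by the standard reciprocity algorithm."""
    assert n > 0 and n % 2 == 1
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:          # pull out factors of 2 using the value of (2/n)
            a //= 2
            if n % 8 in (3, 5):
                result = -result
        a, n = n, a                # quadratic reciprocity for odd a and n
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def xi(a):
    """The sign (-1)^((a-1)/2) for odd a."""
    return 1 if a % 4 == 1 else -1

# For odd a > 0 prime to p: (a/p) = (p/a) if p = 4k+1, and xi(a)(p/a) if p = 4k+3.
for p in (3, 5, 7, 11, 13, 17, 19, 23):
    for a in range(1, 60, 2):
        if a % p == 0:
            continue
        sign = xi(a) if p % 4 == 3 else 1
        assert jacobi(a, p) == sign * jacobi(p, a)
```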


38. For each odd prime divisor p of D, choose a residue r_p modulo p such that
$\big(\frac{r_p}{p}\big) = s_p$. If D is even, choose an odd residue r₂ modulo 8 such that α(r₂) = s₂.
The Chinese Remainder Theorem produces an integer b prime to D such that b ≡
r_p mod p for the odd p's and b ≡ r₂ mod 8. For this integer b and every k > 0, we
have $\big(\frac{b+kD}{p}\big) = s_p$ for each odd p and α(b + kD) = s₂. Dirichlet's Theorem says
that b + kD is a prime q for a suitable choice of k, and this prime q has the required
properties.

39. Problem 37 showed that the product of the genus characters for an odd integer
a such that GCD(a, D) = 1 is $\big(\frac{D}{a}\big)$. Using the genus characters at a = q, we see
that $\big(\frac{D}{q}\big) = 1$. Theorem 1.6b shows that q is primitively representable by some form
(q, b, c) of discriminant D. The values of the genus characters for this form are
their values on q, and we have arranged that these values are the various numbers
s_p. Since there are g + 1 genus characters and the first g of them can be specified
arbitrarily and still give a similarity class modulo D, there are at least 2^g similarity
classes modulo D.

40. Problem 29 shows that the number of classes of type (i) is exactly 2^g. Problems
30-33 show that equivalence of type (i) implies equivalence of type (ii), and they
therefore give a mapping of the set of classes of type (i) onto the set of classes of
type (ii). The definition of “similar modulo D” immediately implies that equivalence
of type (ii) implies equivalence of type (iii), and therefore we obtain a mapping of
the set of classes of type (ii) onto the set of classes of type (iii). Finally Problem 39
shows that there are at least 2^g classes of type (iii). The result follows.


Chapter II 


1. The unital left CG modules correspond (via the universal mapping property of 
a group algebra) to representations of G on complex vector spaces. The theory in 
Chapter VII of Basic Algebra shows that every representation splits as the direct sum 
of irreducible representations, which correspond to simple left CG modules. Hence 
every unital left CG module is semisimple. The left regular representation of G, 
which corresponds to the left CG module CG, decomposes as the sum of irreducible 
representations, each irreducible representation occurring as many times as its degree. 
The sum of all the irreducible subspaces of a given isomorphism type gives one of 
the factors M_n(C) of CG, and every factor arises this way.

2. For (a), rad A = (C + CX)(X² + 1), and S will be the sum of two copies
of C. Finding S requires some computation. We can identify A/(rad A) with the
quotient C[X]/(X² + 1), and direct computation shows that the two idempotents in
this notation having sum 1 are (1/(2i))(X + i) and −(1/(2i))(X − i). The proof of Proposition
2.23 shows how to lift these to idempotents in A. For the first one, put a = (1/(2i))(X + i)
and b = 1 − a = −(1/(2i))(X − i), and observe that (ab)² = 0. The proposition
gives the formula $e = \sum_{k=0}^{2}\binom{4}{k}a^{4-k}b^{k} = a^4 + 4a^3b$, the term for k = 2 being 0.
Then e = a³(a + 4b) = (1/16)(X + i)³(−3X + 5i). So one contribution to S comes
from Ce; the other will come from the complex conjugate in the form of Cf, where
f = (1/16)(X − i)³(−3X − 5i).

We can check directly that e is an idempotent. In fact,

$$e^2 - e = e\big[\tfrac{1}{16}(X + i)^3(-3X + 5i) - 1\big].$$

The polynomial in square brackets vanishes at X = i, and so does its derivative.
Thus the polynomial is divisible by (X − i)², and e² − e = (1/16)(X + i)³(−3X + 5i) ×
[(X − i)²Q(X)] is divisible by (X² + 1)².
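
The same check can be carried out mechanically with polynomial arithmetic. The sketch below is an illustration under stated assumptions: it takes A to be C[X] modulo (X² + 1)², works with E = 16e to keep all coefficients Gaussian integers, and uses hypothetical helper names.

```python
def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def poly_rem_monic(p, m):
    """Remainder of p on division by the monic polynomial m."""
    p = list(p)
    while len(p) >= len(m):
        lead = p[-1]
        shift = len(p) - len(m)
        for k, mk in enumerate(m):
            p[shift + k] -= lead * mk
        p.pop()                      # the leading coefficient is now 0
    return p

X_plus_i = [1j, 1]                    # X + i
E = [1]
for _ in range(3):
    E = poly_mul(E, X_plus_i)         # (X + i)^3
E = poly_mul(E, [5j, -3])             # times (-3X + 5i), so that E = 16e

modulus = poly_mul([1, 0, 1], [1, 0, 1])       # (X^2 + 1)^2
# e^2 - e = (E^2 - 16E)/256, so it suffices to reduce E^2 - 16E.
diff = poly_mul(E, E)
for k, ek in enumerate(E):
    diff[k] -= 16 * ek
assert all(c == 0 for c in poly_rem_monic(diff, modulus))
```

All coefficients here are small Gaussian integers, so the complex floating-point arithmetic is exact.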

For (b), the answer is yes. This problem anticipates Problem 5 below. The algebra 
S is spanned linearly by its idempotents, and Problem 5 shows that the idempotents 
are determined uniquely in the commutative case. 

For (c), rad A = (R + RX)(X² + 1). Call the subalgebra S₀. This subalgebra will
be a 2-dimensional real subalgebra isomorphic to C. To find it, we can go through
the proof of Theorem 2.17 or we can use the Galois group. The latter method is a
good bit easier. Thus we seek those members of S as in (a) that are fixed by complex
conjugation. Since S = Ce + Cē, the result is that S₀ = R(e + ē) + iR(e − ē). This
is unique; in fact, any choice of S₀ has the property that S₀ ⊗_R C is an S for (a), and
we know that the S for (a) is unique.

3. Since rad A is a nilpotent ideal of A, (rad A) ⊗_F B is a nilpotent ideal of
A ⊗_F B, and therefore (rad A) ⊗_F B ⊆ rad(A ⊗_F B). For the reverse inclusion
Proposition 2.31 shows that rad(A ⊗_F B) = I ⊗_F B for some two-sided ideal I of A.
If (rad(A ⊗_F B))ⁿ = 0 and a₁, …, aₙ are in I, then (a₁ ⊗ 1)···(aₙ ⊗ 1) must be 0,
and hence a₁···aₙ = 0. Therefore I ⊆ rad A, and rad(A ⊗_F B) ⊆ (rad A) ⊗_F B.

4. For (a), suppose on the contrary that there is an infinite sequence M₁, M₂, …
of distinct maximal ideals. Then we obtain a decreasing sequence of ideals R ⊇
M₁ ⊇ M₁M₂ ⊇ M₁M₂M₃ ⊇ ···, and the Artinian property shows that M₁···Mₙ =
M₁···MₙM_{n+1} for some n. Since M_{n+1} is prime and M_{n+1} ⊇ M₁···Mₙ, M_{n+1}
contains M_j for some j with 1 ≤ j ≤ n. By maximality, M_{n+1} = M_j, and we have a
contradiction.

In (b), every element of rad R is nilpotent because rad R is nilpotent. Conversely
if x ∈ R is nilpotent with xⁿ = 0, then Rx is nilpotent with (Rx)ⁿ = 0, since
a₁x a₂x ··· aₙx = a₁a₂···aₙxⁿ = 0 for any a₁, …, aₙ ∈ R. Thus Rx ⊆ rad R, and
the nilpotent element x lies in rad R. This proves (b), and (c) follows because R is
semisimple if and only if rad R = 0.

For (d), R semisimple implies that R is a product of full matrix rings over division 
rings. Commutativity implies that the matrices are all of size 1-by-1 and the division 
rings are all fields. 


5. If e′ is a second representative, then e′ = e + r with r ∈ rad R. If n is an odd
integer large enough to have rⁿ = 0, then

$$0 = r^n = (e' - e)^n = e' + \sum_{k=1}^{n-1}(-1)^k\binom{n}{k}e'e - e = e' + \Big(\sum_{k=0}^{n}(-1)^k\binom{n}{k}\Big)e'e - e'e + e'e - e = e' + 0 - e'e + e'e - e = e' - e.$$
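
The cancellation of the middle terms rests on a binomial identity that is easy to test. The sketch below is illustrative only, and `middle_sum` is a hypothetical helper name: it computes the sum of (−1)^k C(n, k) over 1 ≤ k ≤ n − 1, which is the coefficient of e′e in the expansion.

```python
from math import comb

def middle_sum(n):
    """Sum of (-1)^k * C(n, k) for 1 <= k <= n - 1."""
    return sum((-1) ** k * comb(n, k) for k in range(1, n))

# For odd n the middle coefficients cancel, leaving (e' - e)^n = e' - e.
for n in range(3, 30, 2):
    assert middle_sum(n) == 0
```

For even n the sum equals −2 instead, which is why the argument takes n odd.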


6. Let M₁, …, Mₙ be the finitely many maximal ideals, and put N = M₁···Mₙ.
Nakayama's Lemma says that if I is any ideal contained in all maximal ideals, then
the only finitely generated unital R module M having the property that IM = M is
M = 0. The Artinian property shows that N^{k+1} = N^k for some k. We take I = N
and M = N^k in Nakayama's Lemma. The R module M is finitely generated because
Artinian implies Noetherian (Theorem 2.15), and hence Nakayama's Lemma shows
that N^k = 0.

7. Let the maximal ideals be M₁, …, Mₙ, and let (M₁···Mₙ)^k = 0. If P is a
prime ideal, then P ⊇ 0 = (M₁···Mₙ)^k. Since P is prime, P contains one of the
factors. Thus P ⊇ M_j for some j.

8. It helps to have a multiplication table available. If the rows index a factor on
the left and the columns index a factor on the right, each in the order R, M, S, then
the resulting products are given by

$$\begin{pmatrix} R & M & 0 \\ 0 & 0 & M \\ 0 & 0 & S \end{pmatrix}.$$

If I₂ is a left ideal of S and I₁ is a left R submodule of R ⊕ M containing MI₂,
then RI₂ = 0, MI₂ ⊆ I₁, and SI₂ ⊆ I₂. Also, RI₁ ⊆ I₁, MI₁ = 0, and SI₁ = 0.
Thus AI₁ ⊆ I₁ and AI₂ ⊆ I₁ ⊕ I₂. Consequently I₁ ⊕ I₂ is a left ideal of A.

In the reverse direction if I is a left ideal in A, then $I_1 = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} I \subseteq R \oplus M$
and $I_2 = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} I \subseteq S$ are such that I = I₁ ⊕ I₂. Also, r ∈ R implies

$$\begin{pmatrix} r & 0 \\ 0 & 0 \end{pmatrix} I_1 = \begin{pmatrix} r & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} I = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} r & 0 \\ 0 & 0 \end{pmatrix} I \subseteq I_1,$$

while (M ⊕ S)I₁ = 0; and s ∈ S implies

$$\begin{pmatrix} 0 & 0 \\ 0 & s \end{pmatrix} I_2 = \begin{pmatrix} 0 & 0 \\ 0 & s \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} I = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 0 & s \end{pmatrix} I \subseteq I_2,$$

while RI₂ = 0 and m ∈ M implies

$$\begin{pmatrix} 0 & m \\ 0 & 0 \end{pmatrix} I_2 = \begin{pmatrix} 0 & m \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} I = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 0 & m \\ 0 & 0 \end{pmatrix} I \subseteq I_1.$$

9. For (a), suppose A is left Noetherian. The table produced in the solution of
Problem 8 shows that M ⊕ S and R ⊕ M are two-sided ideals of A, and the respective
quotient rings are R and S. As quotients of a left Noetherian ring, R and S have to be
left Noetherian. If {M_j} is an ascending chain of R submodules of M, then $\left\{\begin{pmatrix} 0 & M_j \\ 0 & 0 \end{pmatrix}\right\}$
is an ascending chain of left ideals of A, by Problem 8. The latter must be constant
from some point on, and then the same thing is true for {M_j}.

Conversely suppose that R and S are left Noetherian and that the left R module M
satisfies the ascending chain condition. If {I_j} is an ascending chain of left ideals of A,
then the corresponding sequence {(I₂)_j} is an ascending chain of left ideals in S, and
{(I₁)_j} is an ascending chain of left R submodules of R ⊕ M containing M(I₂)_j. Since
S is left Noetherian, {(I₂)_j} is constant from some point on. Since R = (R ⊕ M)/M
and M satisfy the ascending chain condition for their left R submodules, so does
R ⊕ M, and therefore {(I₁)_j} is constant from some point on.

10. In view of Problem 9a, showing that A is left Noetherian amounts to showing
that R and S are (left) Noetherian and M satisfies the ascending chain condition for
its left R submodules. The ring S is Noetherian by assumption, and R is a field,
hence is Noetherian. The action of R on M is the action of a field on itself, and the
R submodules are trivial. In view of Problem 9b, A fails to be right Noetherian if the
ascending chain condition fails for the right S submodules of M = R. If the ascending
chain condition were to hold, then R would be a finitely generated S module, and
the only denominators needed for members of the full field R of fractions would be
those dividing the product of the denominators of the generators; these fractions are
already in S, and hence S would equal R, contradiction.

The analogs of the results of Problem 9 for the Artinian case show that A fails to
be either left or right Artinian if S is not Artinian. If s is a nonzero nonunit in S, then the
chain of principal ideals {(s^k)} is properly descending, since (s^k) = (s^{k+1}) implies
εs^k = s^{k+1} for some unit ε and since the hypothesis that S is an integral domain
allows us to cancel and obtain ε = s, contradiction.


11. Since R and S are fields, they are left and right Noetherian and Artinian. In
view of Problem 9, we are to show that M = R satisfies both chain conditions for
its left R modules and neither chain condition for its right S modules. Since R is a
field, M = R has only trivial R submodules and satisfies both chain conditions. For
the S action on R, we are to examine the S vector subspaces of R. Since dim_S R
is infinite, there exist both a properly increasing sequence of such subspaces and a
properly decreasing one. Hence neither chain condition is satisfied.


12. For (a), the vector-space dimension over F is certainly 4, and computation
shows that A is closed under products. The choices a = 1 and b = 0 show that A
has an identity.

For (b), let x ≠ 0 be in a two-sided ideal I. If x = $\begin{pmatrix} a & 0 \\ 0 & \sigma(a) \end{pmatrix}$, then x is invertible,
and hence I = A. Otherwise suppose that some matrix x = $\begin{pmatrix} a & b \\ r\sigma(b) & \sigma(a) \end{pmatrix}$ with b ≠ 0
is in I. With c as in the statement of the problem, cx − xc = $\begin{pmatrix} 0 & 2b\sqrt{m} \\ -2r\sigma(b)\sqrt{m} & 0 \end{pmatrix}$
is in I; this matrix is invertible since b ≠ 0, and thus I = A.

To see that A is central, let x be in the center. The computation 0 = cx − xc shows
that b = 0. Thus x is of the form $\begin{pmatrix} a & 0 \\ 0 & \sigma(a) \end{pmatrix}$. Such an x does not commute with $\begin{pmatrix} 0 & 1 \\ r & 0 \end{pmatrix}$
unless a = σ(a), in which case x is in F.


13. The determinant is aσ(a) − rbσ(b) = N_{K/F}(a) − rN_{K/F}(b), and for a given r
it equals 0 for some nonzero member of A if and only if some pair (a, b) ≠ (0, 0) has
N_{K/F}(a) = rN_{K/F}(b). Since r ≠ 0, both a and b are nonzero, and this equality then
holds if and only if r = N_{K/F}(ab⁻¹).

In other words, some nonzero member of A has determinant 0 if r is a norm, and
then A cannot be a division algebra. Conversely if r is not a norm, then every nonzero
member of A is invertible as a matrix. Computation of the inverse matrix shows that
it has the correct form to be in A. Hence A is a division algebra.

When A is not a division algebra, it is anyway finite-dimensional and central simple
and has to be of the form M_n(D) for some n and some division algebra D over F
such that dim M_n(D) = 4. The dimensional formula says that n² dim_F D = 4. Since
n ≠ 1, we must have n = 2 and D = F.


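14. The isomorphism follows from the computation

$$\begin{pmatrix} c & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} a & b \\ r\sigma(b) & \sigma(a) \end{pmatrix}\begin{pmatrix} c & 0 \\ 0 & 1 \end{pmatrix}^{-1} = \begin{pmatrix} a & bc \\ rc^{-1}\sigma(b) & \sigma(a) \end{pmatrix} = \begin{pmatrix} a & bc \\ r'\sigma(c)\sigma(b) & \sigma(a) \end{pmatrix} = \begin{pmatrix} a & bc \\ r'\sigma(bc) & \sigma(a) \end{pmatrix}.$$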


15. Direct computation. 


16. If K is a maximal subfield, then dim_F K = 2. Since the characteristic is not 2,
K = F(√m) for some nonsquare m ∈ F. Define i ∈ K to be √m.

The map f : K → D given by f(a + bi) = a − bi is an algebra homomorphism
into the central simple algebra D. So the Skolem-Noether Theorem produces j ∈ D
with j(a + bi)j⁻¹ = a − bi for all a + bi in K, necessarily with j invertible.
As in the proof of Theorem 2.50, j² = r lies in F. Define k = ij. Then k² =
ijij = i(jij⁻¹)j² = i(−i)j² = −rm, and −rm = k² = ijk implies that k =
−rm(j⁻¹)(i⁻¹) = −rm(r⁻¹j)(m⁻¹i) = −ji.

Let us check the multiplication table for {1, i, j, k}. We know that i² = m, j² = r,
k² = −rm, ij = k, and ji = −k. In addition, we have

jk = jij = (jij⁻¹)j² = (−i)r = −ri,
kj = ijj = i(j²) = ri,
ki = iji = i(jij⁻¹)j = i(−i)j = −mj,
ik = iij = (i²)j = mj.

Hence the F linear map g from A into the given central simple algebra is an algebra
homomorphism sending 1 into 1. Since A is simple, g is one-one. Since A and the
given algebra both have dimension 4, g is onto. Thus g is an algebra isomorphism.
(We did not have to check directly that {1, i, j, k} is linearly independent over F.)
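
The multiplication table can be confirmed in a concrete matrix model. The sketch below is only an illustration under stated assumptions: it uses 2-by-2 matrices of the shape treated in Problem 12, with entries in F(√m) stored as integer pairs (x, y) standing for x + y√m, and the sample values m = 2 and r = 3 as well as all helper names are hypothetical.

```python
M, R = 2, 3   # hypothetical sample values of m and r

def kmul(u, v):
    """Multiply u = x + y*sqrt(m) and v, each stored as a pair (x, y)."""
    return (u[0] * v[0] + M * u[1] * v[1], u[0] * v[1] + u[1] * v[0])

def kadd(u, v):
    return (u[0] + v[0], u[1] + v[1])

def matmul(A, B):
    """Product of 2-by-2 matrices with entries in F(sqrt(m))."""
    return tuple(
        tuple(kadd(kmul(A[i][0], B[0][j]), kmul(A[i][1], B[1][j])) for j in range(2))
        for i in range(2)
    )

def scalar(c):
    """The scalar matrix c*1 for c in F."""
    return (((c, 0), (0, 0)), ((0, 0), (c, 0)))

def scale(c, A):
    return matmul(scalar(c), A)

ZERO = (0, 0)
i = (((0, 1), ZERO), (ZERO, (0, -1)))     # diag(sqrt(m), -sqrt(m))
j = ((ZERO, (1, 0)), ((R, 0), ZERO))      # rows (0, 1) and (r, 0)
k = matmul(i, j)

assert matmul(i, i) == scalar(M)          # i^2 = m
assert matmul(j, j) == scalar(R)          # j^2 = r
assert matmul(k, k) == scalar(-R * M)     # k^2 = -rm
assert matmul(j, i) == scale(-1, k)       # ji = -k
assert matmul(j, k) == scale(-R, i)       # jk = -ri
assert matmul(k, j) == scale(R, i)        # kj = ri
assert matmul(k, i) == scale(-M, j)       # ki = -mj
assert matmul(i, k) == scale(M, j)        # ik = mj
```

The table does not depend on whether r is a norm, so the check says nothing about whether the algebra is a division algebra.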


17. A is an algebra by routinely checking that it is closed under multiplication.
Manifestly A has an identity and has dimension 9 over F. If I is a nonzero two-sided
ideal in A, let x = a + bj + cj² be nonzero in I, and assume that x is chosen in I such
that as few of the coefficients a, b, c are nonzero as possible. Possibly by multiplying
x by j or j² on the right, we may assume that a ≠ 0. Choose d ∈ K with d, σ(d), and
σ²(d) distinct. Computation shows that dx − xd has one fewer nonzero coefficient.
By minimality we must have dx − xd = 0; hence x must have had just one nonzero
coefficient. Such an x is invertible, and thus 1 is in I and I = A. Hence A is simple.
To see that A has just F as center, we test a general element x = a + bj + cj² for
commutativity with both d ∈ K and the element j, and we find that b = c = 0 and
a = σ(a) = σ²(a).

18. Since $A$ is finite-dimensional central simple, $A \cong M_n(D)$ for some $n$ and
some central division algebra $D$ over $F$. Then $9 = \dim_F A = n^2 \dim_F D$, and the only
possibilities are that $n = 3$ and $D = F$, or that $n = 1$. In the first case, $A \cong M_3(F)$,
and in the second case, $A$ is a division algebra. In the first case any column of $A$
(when viewed as $M_3(F)$) is a 3-dimensional left $A$ module; in the second case $A$ has
no proper nonzero left $A$ modules.


19. Left multiplication by $K$ makes $A$ into a $K$ vector space, and the left $K$
submodules of $A$ are the $K$ vector subspaces. The $F$ dimension of such a subspace
is 3 times the $K$ dimension. Hence the left $K$ submodules of $A$ of $F$ dimension 3 are the
subspaces of $K$ dimension 1, which consist of all left $K$ multiples of any nonzero vector.

Let $x = a_0 + b_0 j + c_0 j^2$ be nonzero in $A$. Then $Kx$ is a left $A$ module if and only if
$jx$ lies in $Kx$. Here $jx = \sigma(a_0) j + \sigma(b_0) j^2 + \sigma(c_0) j^3 = r\sigma(c_0) + \sigma(a_0) j + \sigma(b_0) j^2$.
This equals $dx$ for some $d \in K$ if and only if

$$r\sigma(c_0) = d a_0, \qquad \sigma(a_0) = d b_0, \qquad \text{and} \qquad \sigma(b_0) = d c_0. \tag{$*$}$$

Combining the second and third equations gives the necessary condition that $\sigma^2(a_0) =
\sigma(d b_0) = \sigma(d)\sigma(b_0) = \sigma(d) d c_0$. Applying $\sigma$ gives the necessary condition $a_0 =
\sigma^3(a_0) = \sigma(\sigma(d) d c_0) = \sigma^2(d)\sigma(d)\sigma(c_0) = \sigma^2(d)\sigma(d) r^{-1} d a_0 = N_{K/F}(d) r^{-1} a_0$.
Thus it is necessary that some $d \in K$ have $N_{K/F}(d) = r$. Conversely if $d \in K$ has
$N_{K/F}(d) = r$, then $x_0 = 1 + d^{-1} j + d^{-1}\sigma(d)^{-1} j^2$ has $a_0 = 1$, $b_0 = d^{-1}$, and
$c_0 = d^{-1}\sigma(d)^{-1}$, and we observe that the conditions $(*)$ are satisfied; thus $K x_0$ is a
left $A$ submodule.


Chapter III


1. For (a), define $f : A \times K \to \mathrm{End}_F A$ by $f(a, c)(a') = aa'c$ just as in the
proof of Theorem 3.3. The verification that the action of right multiplication by $b \in B$
commutes with $f(a, c)$, i.e., that $f(a, c)$ is in $\mathrm{End}_{B^o} A$, uses that $B$ commutes with
$K$, and the verification that the extended map $f : A \otimes_F K \to \mathrm{End}_{B^o} A$ respects
multiplication uses that $K$ is commutative; otherwise the argument is the same as
with Theorem 3.3. The algebra $A \otimes_F K$ is central simple over $K$, and $B$ is an algebra
over $K$ because $B$ contains $K$. Since $A \otimes_F K$ is simple, $f$ is one-one.

For (b), let $V$ be the unique-up-to-isomorphism simple finite-dimensional left $B$
module. If the left $B$ module $B$ is the direct sum of $m$ copies of $V$, then the proof
of Theorem 2.2 shows that $B^o \cong \mathrm{End}_B B \cong M_m(D^o)$, where $D^o$ is the central
division algebra over $K$ given by $D^o = \mathrm{End}_B V$. Hence $B \cong M_m(D)$. If $V^o$
denotes the unique-up-to-isomorphism simple finite-dimensional left $B^o$ module and
if $D'^o = \mathrm{End}_{B^o}(V^o)$, then we have $B \cong \mathrm{End}_{B^o}(B^o) \cong M_{m'}(D'^o)$, and it follows that
$m = m'$ and $D' = D^o$.

Since $B \subseteq A$, $A$ is a right $B$ module, hence a left $B^o$ module, and $A$ has to
be the direct sum of some number $n$ of copies of $V^o$. Then the same argument
gives an isomorphism $\mathrm{End}_{B^o} A \cong M_n(D'^o) \cong M_n(D)$. The Double Centralizer
Theorem gives $\dim_F A = (\dim_F B)(\dim_F K)$, and thus $\dim_K A = \dim_F B =
(\dim_F K)(\dim_K B) = (\dim_F K)(m \dim_K V)$. Meanwhile, $\dim_K A = n \dim_K V$
and thus $n \dim_K V = (\dim_F K)(m \dim_K V)$. So $n = m \dim_F K$. Consequently
$\dim_F \mathrm{End}_{B^o} A = n^2 \dim_F D = m^2 (\dim_F D)(\dim_F K)^2 = (\dim_F B)(\dim_F K)^2 =
(\dim_F A)(\dim_F K) = \dim_F(A \otimes_F K)$, and the map $f$ in (a) is onto.

For (c), application of (b) and an isomorphism from above gives $A \otimes_F K \cong
\mathrm{End}_{B^o}(A) \cong M_n(D)$, and we have seen that $B \cong M_m(D)$. Thus $A \otimes_F K$ and $B$ lie
in the same Brauer equivalence class in $\mathcal{B}(K)$.


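2. Take the product over $\sigma$ of the equality $\rho(a(\sigma, \tau))\,a(\rho, \sigma\tau) = a(\rho, \sigma)\,a(\rho\sigma, \tau)$,
and get $\rho\bigl(\prod_\sigma a(\sigma, \tau)\bigr) \prod_\sigma a(\rho, \sigma) = \prod_\sigma a(\rho, \sigma) \prod_\sigma a(\sigma, \tau)$. Canceling gives
$\rho\bigl(\prod_\sigma a(\sigma, \tau)\bigr) = \prod_\sigma a(\sigma, \tau)$. Thus $\prod_\sigma a(\sigma, \tau)$ is fixed by every member of the
Galois group and is in $F^\times$.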

3. Proposition 3.32 and Theorem 3.31 show that $H^{2k}(\mathrm{Gal}(K/F), K^\times) \cong
H^2(\mathrm{Gal}(K/F), K^\times)$ for $k \geq 1$ and $H^{2k+1}(\mathrm{Gal}(K/F), K^\times) \cong H^1(\mathrm{Gal}(K/F), K^\times)$
for $k \geq 0$. Then Corollary 3.34 gives $H^{2k} \cong F^\times / N_{K/F}(K^\times)$ for all $k \geq 1$, and
Theorem 3.17 gives $H^{2k+1} = 0$ for all $k \geq 0$. Finally $H^0$ is the subgroup of elements
in $K^\times$ fixed by $\mathrm{Gal}(K/F)$, and this is $F^\times$.
4. For (a), it is shown in Chapter IX of Basic Algebra that $\mathbb{Q}(e^{2\pi i/p})$ is a Galois
extension of $\mathbb{Q}$ with cyclic Galois group of order $p - 1$ whenever $p$ is prime. Here
$p = 7$. Complex conjugation is a member of the Galois group of order 2, and $K$ is the
subfield fixed by this subgroup. Hence $K$ has degree $6/2 = 3$ over $\mathbb{Q}$, and its Galois
group is the quotient of a cyclic group of order 6 by the subgroup of order 2, hence is
cyclic of order 3. The powers $\zeta, \zeta^2, \dots, \zeta^6$ form a basis of the $\mathbb{Q}$ vector space $\mathbb{Q}(\zeta)$, and
the sums of them with their images under complex conjugation span $K$. These sums
are $\tau_1, \tau_2, \tau_3$. Since there are only 3 such sums, they must be linearly independent
over $\mathbb{Q}$. Put $\tau_k = \zeta^k + \zeta^{-k}$. Then $\tau_k$ depends only on $k \bmod 7$, and $\tau_k = \tau_{-k}$.
Hence the only $\tau_k$'s that are not any of $\tau_1, \tau_2, \tau_3$ are the ones with $k \equiv 0 \bmod 7$. The
members of the Galois group of $\mathbb{Q}(\zeta)$ carry $\zeta$ to $\zeta^k$ for $1 \leq k \leq 6$ and therefore carry
$\tau_1$ to $\tau_k$, $\tau_2$ to $\tau_{2k}$, and $\tau_3$ to $\tau_{3k}$. None of $k, 2k, 3k$ is divisible by 7, and the result
follows.

For (b), let $\sigma \in \mathrm{Gal}(K/\mathbb{Q})$ have $\sigma(\tau_1) = \tau_2$, $\sigma(\tau_2) = \tau_3$, and $\sigma(\tau_3) = \tau_1$. For
$x \in K$, we have $N_{K/\mathbb{Q}}(x) = x\sigma(x)\sigma^2(x)$. With $x = a\tau_1 + b\tau_2 + c\tau_3$, we get 27 terms
when everything is expanded out, and they are the ones listed.

For (c), $\tau_1 + \tau_2 + \tau_3 = -1$ because $\sum_{k=0}^{6} \zeta^k = 0$. Next, $\tau_1\tau_2 = (\zeta + \zeta^{-1})
(\zeta^2 + \zeta^{-2}) = \zeta^3 + \zeta^{-1} + \zeta + \zeta^{-3} = \tau_1 + \tau_3$, and the other two identities on the
second line are similar. Finally $\tau_1^2 = (\zeta + \zeta^{-1})^2 = \zeta^2 + 2 + \zeta^{-2} = \tau_2 + 2$, and the
other two identities are similar.

For (d), let $\alpha, \beta, \gamma, \delta$ be the expressions involving $\tau_1, \tau_2, \tau_3$ on the right side in
(b). First we have $\tau_1^3 = \tau_1^2\tau_1 = (\tau_2 + 2)\tau_1 = \tau_1\tau_2 + 2\tau_1 = 3\tau_1 + \tau_3$. Summing this
expression and similar expressions for $\tau_2^3$ and $\tau_3^3$ gives $\alpha = 4(\tau_1 + \tau_2 + \tau_3) = -4$.
Second $\beta = \tau_1\tau_2\tau_3 = (\tau_1 + \tau_3)\tau_3 = \tau_2 + \tau_3 + \tau_1 + 2 = 1$. In (d), the coefficient
of $abc$ is $\alpha + 3\beta = -4 + 3 = -1$, and the coefficient of $a^3 + b^3 + c^3$ is $\beta = 1$.
Third $\tau_1^2\tau_2 = \tau_1(\tau_1 + \tau_3) = (\tau_2 + 2) + (\tau_2 + \tau_3) = \tau_3 + 2\tau_2 + 2$. Similarly
$\tau_2^2\tau_3 = \tau_1 + 2\tau_3 + 2$ and $\tau_3^2\tau_1 = \tau_2 + 2\tau_1 + 2$. The sum is $\gamma = 3(\tau_1 + \tau_2 + \tau_3) + 6 = 3$.
Fourth $\tau_1\tau_2^2 = \tau_1(\tau_3 + 2) = \tau_2 + \tau_3 + 2\tau_1$. Similarly $\tau_2\tau_3^2 = \tau_1 + \tau_3 + 2\tau_2$ and
$\tau_3\tau_1^2 = \tau_1 + \tau_2 + 2\tau_3$. The sum is $\delta = 4(\tau_1 + \tau_2 + \tau_3) = -4$.

For (e), the norm modulo 3 is $(a^3 + b^3 + c^3) - abc - (a^2c + ab^2 + bc^2)$, and this is
$\equiv (a + b + c) - abc - (a^2c + ab^2 + bc^2) \bmod 3$. Any nonzero square is $\equiv 1 \bmod 3$,
and we consider cases. If 3 does not divide $abc$, then $a^2 \equiv b^2 \equiv c^2 \equiv 1 \bmod 3$, and
the norm is $\equiv -abc \not\equiv 0 \bmod 3$. If 3 divides $a$ but not $bc$, then $b^2 \equiv c^2 \equiv 1 \bmod 3$,
and the norm is $\equiv (b + c) - b = c \not\equiv 0 \bmod 3$. If 3 divides $a$ and $b$ but not $c$, then
the norm is $\equiv c \not\equiv 0 \bmod 3$, while if 3 divides $a$ and $c$ but not $b$, then the norm is
$\equiv b \not\equiv 0 \bmod 3$. The case that 3 divides all of $a, b, c$ is excluded by the condition that
$\mathrm{GCD}(a, b, c) = 1$, and all other cases are handled by symmetry. Thus in all cases
the norm is not divisible by 3.

For (f), let $x, y, z$ be members of $\mathbb{Q}$ not all 0. Choose integers $a, b, c$ and relatively
prime integers $n$ and $d$ such that $x = nd^{-1}a$, $y = nd^{-1}b$, $z = nd^{-1}c$, and
$\mathrm{GCD}(a, b, c) = 1$. Then $N_{K/\mathbb{Q}}(x\tau_1 + y\tau_2 + z\tau_3) = d^{-3}n^3 N_{K/\mathbb{Q}}(a\tau_1 + b\tau_2 + c\tau_3)$.
Applying (e) and supposing that 3 is a norm, we obtain $3 = d^{-3}n^3(3k + (1 \text{ or } 2))$
for some integer $k$. Thus $3d^3 = n^3(3k + (1 \text{ or } 2))$. This equality forces $n$ to divide
$d$, and we may therefore take $n = 1$. Thus $3d^3 = 3k + (1 \text{ or } 2)$. The left side is
divisible by 3, and the right side is not. Hence 3 is not a norm.
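
The computations in (d) and (e) lend themselves to a numerical sanity check. The sketch below is an illustration rather than part of the hint. It assumes that $\gamma$ is the coefficient of $a^2b + b^2c + c^2a$ and that $\delta$ is the coefficient of $a^2c + ab^2 + bc^2$, which is what expanding $x\sigma(x)\sigma^2(x)$ gives; the function names are hypothetical. It compares the resulting cubic form with a floating-point evaluation of the norm.

```python
import cmath
from math import gcd

# zeta = e^{2 pi i / 7} and tau_k = zeta^k + zeta^{-k} for k = 1, 2, 3
ZETA = cmath.exp(2j * cmath.pi / 7)
TAU = [(ZETA ** k + ZETA ** (-k)).real for k in (1, 2, 3)]


def norm_numeric(a, b, c):
    """Evaluate x * sigma(x) * sigma^2(x) in floating point and round.

    Here x = a*tau1 + b*tau2 + c*tau3 and sigma cycles tau1 -> tau2 -> tau3 -> tau1.
    """
    t1, t2, t3 = TAU
    value = (
        (a * t1 + b * t2 + c * t3)
        * (a * t2 + b * t3 + c * t1)
        * (a * t3 + b * t1 + c * t2)
    )
    return round(value)


def norm_formula(a, b, c):
    """Cubic form with beta = 1, alpha + 3*beta = -1, gamma = 3, delta = -4."""
    return (
        (a ** 3 + b ** 3 + c ** 3)
        - a * b * c
        + 3 * (a * a * b + b * b * c + c * c * a)
        - 4 * (a * a * c + a * b * b + b * c * c)
    )


def norm_avoids_multiples_of_3(bound):
    """Check part (e) for all integer triples with entries in [-bound, bound]."""
    rng = range(-bound, bound + 1)
    for a in rng:
        for b in rng:
            for c in rng:
                if gcd(gcd(a, b), c) == 1 and norm_formula(a, b, c) % 3 == 0:
                    return False
    return True
```

The rounding in `norm_numeric` is safe only for small integer inputs, where the floating-point error is far below 1/2; the check is an experiment, not a proof.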


5. For (a), Dirichlet's Theorem (Theorem 1.21) says that there are infinitely many
primes of the form $p = kn + 1$. For any such $p$, $n$ divides $p - 1$. For (b) with this $p$,
the Galois group of $\mathbb{Q}(e^{2\pi i/p})/\mathbb{Q}$ is cyclic of order $p - 1$ and has a cyclic subgroup
of order $(p - 1)/n$. The corresponding subfield is a Galois extension of $\mathbb{Q}$ of degree
$n$ with cyclic Galois group.
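
As a small illustration of (a), the following sketch searches for the least prime of the form $kn + 1$ by trial division. The function names are hypothetical, and the search is only meant for small $n$; Dirichlet's Theorem is what guarantees that the loop terminates.

```python
def is_prime(p):
    """Trial-division primality test, adequate for small p."""
    if p < 2:
        return False
    d = 2
    while d * d <= p:
        if p % d == 0:
            return False
        d += 1
    return True


def first_prime_one_mod(n):
    """Return (p, k) with p = k*n + 1 the least prime congruent to 1 mod n."""
    k = 1
    while not is_prime(k * n + 1):
        k += 1
    return k * n + 1, k
```

For $n = 3$ the search returns $p = 7$ with $k = 2$, which is the situation of Problem 4: the subgroup of order $(p-1)/n = 2$ is generated by complex conjugation, and its fixed field is cyclic of degree 3 over $\mathbb{Q}$.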

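6. For $0 \leq k < n$ and $0 \leq l < n$, we have $x_{\sigma^k} x_{\sigma^l} = j^k j^l = j^{k+l}$. Meanwhile,
$x_{\sigma^{k+l}}$ equals $j^{k+l}$ if $k + l < n$ and equals $j^{k+l-n}$ if $k + l \geq n$. So $x_{\sigma^k} x_{\sigma^l} = x_{\sigma^{k+l}}$ if
$k + l < n$ and $x_{\sigma^k} x_{\sigma^l} = j^n x_{\sigma^{k+l-n}} = r x_{\sigma^{k+l-n}}$ if $k + l \geq n$. Thus $a(\sigma^k, \sigma^l)$ has the
stated value.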

7. It is just a question of checking that $c_{\sigma^k}\, \sigma^k(c_{\sigma^l}) = a(\sigma^k, \sigma^l)\, c_{\sigma^{k+l}}$ with $a(\sigma^k, \sigma^l)$
as in the previous problem.


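8. We have $\partial_0(1, \sigma^k) = 1 - \sigma^k$ and thus

$$f_0\partial_0(1, \sigma^k) = 1 - \sigma^k = (\sigma - 1)\bigl(-(1 + \sigma + \cdots + \sigma^{k-1})\bigr).$$

If we put $f_1(1, \sigma^k) = -(1 + \sigma + \cdots + \sigma^{k-1})$, then we have $T f_1(1, \sigma^k) = f_0\partial_0(1, \sigma^k)$
for all $k$.

Next, for $k \leq l$, we have $\partial_1(1, \sigma^k, \sigma^l) = (\sigma^k, \sigma^l) - (1, \sigma^l) + (1, \sigma^k) =
\sigma^k(1, \sigma^{l-k}) - (1, \sigma^l) + (1, \sigma^k)$. Then $f_1\partial_1(1, \sigma^k, \sigma^l)$ equals

$$-\sigma^k(1 + \sigma + \cdots + \sigma^{l-k-1}) + (1 + \sigma + \cdots + \sigma^{l-1}) - (1 + \sigma + \cdots + \sigma^{k-1}) = 0.$$

For $k > l$, the term $(\sigma^k, \sigma^l)$ is replaced by $\sigma^k(1, \sigma^{n+l-k})$. Thus $\partial_1(1, \sigma^k, \sigma^l) =
\sigma^k(1, \sigma^{n+l-k}) - (1, \sigma^l) + (1, \sigma^k)$. Then $f_1\partial_1(1, \sigma^k, \sigma^l)$ is

$$\begin{aligned}
&-\sigma^k(1 + \sigma + \cdots + \sigma^{n+l-k-1}) + (1 + \sigma + \cdots + \sigma^{l-1}) - (1 + \sigma + \cdots + \sigma^{k-1}) \\
&\quad= -(1 + \sigma + \cdots + \sigma^{n+l-1}) + (1 + \sigma + \cdots + \sigma^{l-1}) \\
&\quad= -\sigma^l(1 + \sigma + \cdots + \sigma^{n-1}).
\end{aligned}$$

If we define $f_2$ as in the problem, then in the two cases we have

$$\begin{aligned}
k \leq l:&\quad N f_2(1, \sigma^k, \sigma^l) = (1 + \sigma + \cdots + \sigma^{n-1})(0) = 0 = f_1\partial_1(1, \sigma^k, \sigma^l), \\
k > l:&\quad N f_2(1, \sigma^k, \sigma^l) = (1 + \sigma + \cdots + \sigma^{n-1})(-\sigma^l) = f_1\partial_1(1, \sigma^k, \sigma^l).
\end{aligned}$$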
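
The identity $f_1\partial_1 = N f_2$ can be tested directly in the group ring of a cyclic group. The sketch below is an illustration under stated assumptions: elements of $\mathbb{Z}G$ are stored as coefficient lists indexed by the exponent of $\sigma$, $f_1$ is extended $\mathbb{Z}G$-linearly from its values on $(1, \sigma^k)$, and $f_2(1, \sigma^k, \sigma^l)$ is taken to be $0$ for $k \leq l$ and $-\sigma^l$ for $k > l$, as the two cases above indicate. All function names are hypothetical.

```python
def shift(x, a):
    """Multiply the group-ring element x by sigma^a."""
    n = len(x)
    return [x[(i - a) % n] for i in range(n)]


def add(x, y):
    return [s + t for s, t in zip(x, y)]


def neg(x):
    return [-s for s in x]


def mul(x, y):
    """Product in Z[C_n], computed as a cyclic convolution."""
    n = len(x)
    out = [0] * n
    for i, s in enumerate(x):
        for j, t in enumerate(y):
            out[(i + j) % n] += s * t
    return out


def geometric(m, n):
    """Return 1 + sigma + ... + sigma^(m-1) in Z[C_n] for 0 <= m <= n."""
    return [1 if i < m else 0 for i in range(n)]


def f1(a, b, n):
    """Value of f_1 on (sigma^a, sigma^b), using f_1(1, sigma^k) = -(1 + ... + sigma^(k-1))."""
    k = (b - a) % n
    return shift(neg(geometric(k, n)), a)


def f1_boundary(k, l, n):
    """f_1 applied to (sigma^k, sigma^l) - (1, sigma^l) + (1, sigma^k)."""
    return add(add(f1(k, l, n), neg(f1(0, l, n))), f1(0, k, n))


def norm_times_f2(k, l, n):
    """N * f_2(1, sigma^k, sigma^l) with N = 1 + sigma + ... + sigma^(n-1)."""
    f2 = [0] * n
    if k > l:
        f2[l] = -1
    return mul(geometric(n, n), f2)
```

Comparing `f1_boundary(k, l, n)` with `norm_times_f2(k, l, n)` for all $0 \leq k, l < n$ reproduces the two cases of the hint.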


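9. To $\psi$ in $\mathrm{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, K^\times)$, the chain map of the previous problem associates
$\psi \circ f_2$ in $\mathrm{Hom}_{\mathbb{Z}G}(\mathbb{Z}G(\{(1, g_1, g_2)\}), K^\times)$, and then the corresponding member
of $C^2(G, K^\times)$ is $\Phi_2(\psi f_2)$ whose value at $(g_1, g_2)$ is $\psi f_2(1, g_1, g_1g_2)$. That is,
$\Phi_2(\psi f_2)(\sigma^k, \sigma^l) = \psi f_2(1, \sigma^k, \sigma^{k+l})$, and this by Problem 8 is $\psi(0)$ if $k + l < n$
and is $\psi(-\sigma^{k+l-n}) = \psi(\sigma^{k+l-n})^{-1}$ if $k + l \geq n$.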

10. Taking Proposition 3.32 into account, we see that the mapping whose kernel
gives the cocycles is $\mathrm{Hom}(T, 1) : \mathrm{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, K^\times) \to \mathrm{Hom}_{\mathbb{Z}G}(\mathbb{Z}G, K^\times)$. Here
$\mathrm{Hom}(T, 1)\psi = \psi \circ T$. We are identifying $\psi$ with $\psi(1)$ and also $\psi \circ T$ with
$\psi(T(1)) = \psi(\sigma - 1) = (\sigma - 1)\psi(1)$ in additive notation. Hence the effect of
$\mathrm{Hom}(T, 1)$ is to carry $y$ to $\sigma(y)y^{-1}$ in multiplicative notation. A necessary and
sufficient condition for $\sigma(y)y^{-1}$ to be 1 is that $y$ be in $F^\times$, since the subgroup of $K^\times$
fixed by $G$ is $F^\times$.

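11. Since $\psi(0) = 1$ and $\psi(\sigma^{k+l-n}) = \sigma^{k+l-n}(\psi(1)) = \psi(1) = r^{-1}$, the member $a$
of $C^2(G, K^\times)$ that corresponds to $\psi$ has

$$a(\sigma^k, \sigma^l) = \begin{cases} 1 & \text{if } k + l < n, \\ r & \text{if } k + l \geq n, \end{cases}$$

and this is the 2-cocycle of Problem 6.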
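
Because $r$ lies in $F$ and is therefore fixed by the Galois group, the cocycle identity $\rho(a(\sigma, \tau))\,a(\rho, \sigma\tau) = a(\rho, \sigma)\,a(\rho\sigma, \tau)$ for this $a$ reduces to an identity between exponents of $r$. The following sketch, with hypothetical function names, checks that reduced identity for small $n$; it is an illustration, not part of the hint.

```python
def carry(k, l, n):
    """Exponent of r in a(sigma^k, sigma^l): 1 if k + l >= n, else 0."""
    return 1 if k + l >= n else 0


def is_two_cocycle(n):
    """Check e(l, m) + e(k, l+m) == e(k, l) + e(k+l, m) with indices taken mod n."""
    for k in range(n):
        for l in range(n):
            for m in range(n):
                left = carry(l, m, n) + carry(k, (l + m) % n, n)
                right = carry(k, l, n) + carry((k + l) % n, m, n)
                if left != right:
                    return False
    return True
```

Both sides count the number of carries that occur when $k + l + m$ is reduced modulo $n$, which is why the identity holds.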


12. Corollary 3.34 and Theorem 3.14 combine to give us a group isomorphism
$\mathcal{B}(K/F) \cong F^\times / N_{K/F}(K^\times)$, and the above problems show that the element $r$ of $F^\times$
used in defining $A$ corresponds under this isomorphism to the coset of $r^{-1}$. Hence
the order of the Brauer equivalence class of $A$ equals the order of the coset of $r$, as
required.

If $A$ is not a division algebra, then $A \cong M_m(D)$ for some central division algebra
$D$ over $F$ and for some integer $m > 1$. Here $\dim_F D = (n/m)^2 < n^2$. Corollary
3.15 then gives the contradiction that the order of the Brauer equivalence class of $D$,
which is the same as the order of the class of $A$, divides $n/m$, which in turn is $< n$.


13. The Skolem-Noether Theorem shows that the image matrices under two
different isomorphisms $\varphi$ and $\psi$ have to be conjugate to one another, say with $\varphi =
C^{-1}\psi C$. Then

$$\begin{aligned}
\det(\varphi(X1 - a \otimes 1)) &= \det(C^{-1}\, \psi(X1 - a \otimes 1)\, C) \\
&= (\det C)^{-1} \det(\psi(X1 - a \otimes 1)) (\det C) \\
&= \det(\psi(X1 - a \otimes 1)).
\end{aligned}$$


14. Let $B = A \otimes_F K$. The left $B$ module $B$ is semisimple and is the direct sum
of $n$ isomorphic simple modules of dimension $n$. On each the operation of $a \otimes 1$ has
characteristic polynomial $\det(X1 - a \otimes 1)$, and the characteristic polynomial for the
direct sum of the spaces is the product of the characteristic polynomials.


15. Arguing by contradiction, we may assume that the statement is false for
some monic $P = P(X)$ and that $P$ has the lowest possible degree among all monic
polynomials for which the assertion is false. Factor $P$ over $K$ into powers of distinct
irreducible polynomials as $P = P_1^{k_1} \cdots P_r^{k_r}$. The $n$-fold product of $P_1^{k_1} \cdots P_r^{k_r}$
with itself is in $F[X]$ by assumption and is therefore invariant under $\mathrm{Gal}(K/F)$.
Consequently for each $\sigma \in \mathrm{Gal}(K/F)$ and each $P_j$, there exists some $P_i$ such that
$P_i = \sigma(P_j)$. It follows that if $H$ is the subgroup of $G = \mathrm{Gal}(K/F)$ fixing $P_1$, then
$Q = \prod_{\sigma \in G/H} \sigma(P_1)$ is the product of distinct irreducible factors of $P$ and hence
divides $P$. The polynomial $Q$ is fixed by every member of $G$ and hence is monic
in $F[X]$. Thus $Q \neq P$. Then $Q^n$ is in $F[X]$, and hence $(P/Q)^n$ is in $F[X]$. The
fact that $P$ is not in $F[X]$ implies that $Q \neq P$. Therefore $\deg(P/Q) < \deg P$. By
the minimal choice of $\deg P$, $P/Q$ is in $F[X]$. Therefore $P = (P/Q)Q$ is in $F[X]$,
contradiction.

16. For a matrix $m$ with entries in a field, passing to a larger field does not change
$\det(X1 - m)$. Suppose we start with two finite Galois extensions $K_1$ and $K_2$ of $F$
that split $A$. Let $K_1$ be a splitting field for a polynomial $g_1 \in F[X]$, and let $K_2$ be
a splitting field for $g_2 \in F[X]$. Define $K$ to be a splitting field for $g_1g_2$. Then $K$ is
a finite Galois extension of $F$, and we can regard it as containing both $K_1$ and $K_2$.
Applying the first sentence of this paragraph first to $K_1$ and $K$ and then to $K_2$ and $K$,
we see that the reduced characteristic polynomial is the same over $K_1$ as it is over $K_2$.

17. The formulas for $\mathrm{Nrd}_{A/F}(ab)$ and $\mathrm{Nrd}_{A/F}(1)$ follow from properties of
determinants. From Problem 14 we observe that $\det L(a) = (-1)^{n^2} \det(-L(a))$ and
$\det(-\varphi(a \otimes 1)) = (-1)^n \det(\varphi(a \otimes 1))$. Substituting $X = 0$ into the formula
therefore gives us $N_{A/F}(a) = \det L(a) = (-1)^{n^2} \det(-L(a)) = (-1)^{n^2} \det(-\varphi(a \otimes 1))^n =
(-1)^{n^2} ((-1)^n)^n \det(\varphi(a \otimes 1))^n = \det(\varphi(a \otimes 1))^n = \mathrm{Nrd}_{A/F}(a)^n$. If $a$ is invertible,
then $1 = \mathrm{Nrd}_{A/F}(1) = \mathrm{Nrd}_{A/F}(aa^{-1}) = \mathrm{Nrd}_{A/F}(a)\mathrm{Nrd}_{A/F}(a^{-1})$ shows that
$\mathrm{Nrd}_{A/F}(a)$ is nonzero. Conversely if $\mathrm{Nrd}_{A/F}(a) \neq 0$, then $N_{A/F}(a) \neq 0$ and hence
$\det L(a) \neq 0$. If $P(X)$ is the characteristic polynomial of $L(a)$, then the Cayley-Hamilton
Theorem shows that $P(L(a)) = 0$. Since $\det L(a) \neq 0$, $P(X)$ has a nonzero constant
term. Therefore we can separate the constant term in the equation $P(L(a)) = 0$ to
exhibit an identity of the form $L(a)Q(L(a)) = 1$ for some polynomial $Q(X)$, and
the element $Q(a)$ is a 2-sided inverse to $a$ in $A$. This proves (a), and the conclusion
about division algebras is immediate.


18. The definition gives

$$\begin{aligned}
m(dx_\rho) &= \sum_\mu \mu(d)\,a(\mu, \rho)\,E_{\mu,\mu\rho}, \\
m(cx_\tau) &= \sum_\sigma \sigma(c)\,a(\sigma, \tau)\,E_{\sigma,\sigma\tau}, \\
m((dx_\rho)(cx_\tau)) &= m(d\rho(c)\,a(\rho, \tau)\,x_{\rho\tau}) = \sum_\mu \mu\bigl(d\rho(c)\,a(\rho, \tau)\bigr)\,a(\mu, \rho\tau)\,E_{\mu,\mu\rho\tau}.
\end{aligned}$$

Also we have

$$\begin{aligned}
m(dx_\rho)\,m(cx_\tau) &= \sum_{\mu,\sigma} \mu(d)\,a(\mu, \rho)\,\sigma(c)\,a(\sigma, \tau)\,E_{\mu,\mu\rho}E_{\sigma,\sigma\tau} \\
&= \sum_\mu \mu(d)\,\mu\rho(c)\,a(\mu, \rho)\,a(\mu\rho, \tau)\,E_{\mu,\mu\rho\tau}.
\end{aligned}$$

This matches $m((dx_\rho)(cx_\tau))$ by the cocycle relation for $a$.

For the reduced norm we have two one-one $F$ algebra homomorphisms of $A$ into
$M_n(K)$, one via the mapping $m$ above and one by the embedding $A \to A \otimes_F 1 \subseteq
A \otimes_F K \cong M_n(K)$, and these are conjugate by the Skolem-Noether Theorem. Hence
the determinant gives the same result in the two cases. The determinant in the second
case gives the reduced norm, and hence it must give the reduced norm in the first case.


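19. The algebra $\mathbb{H}$ can be realized as all complex matrices $x = \begin{pmatrix} \alpha & \beta \\ -\bar{\beta} & \bar{\alpha} \end{pmatrix}$, and
$\mathrm{Nrd}_{\mathbb{H}/\mathbb{R}}(x) = |\alpha|^2 + |\beta|^2$ and $N_{\mathbb{H}/\mathbb{R}}(x) = (|\alpha|^2 + |\beta|^2)^2$ as a special case of Problem 18.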
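
The relation $N = \mathrm{Nrd}^n$ of Problem 17 with $n = 2$ can be seen concretely for the quaternions. The sketch below is an illustration with hypothetical function names. It assumes the identification $x = a + bi + cj + dk$ with $\alpha = a + bi$ and $\beta = c + di$, writes out the matrix of left multiplication by $x$ in the basis $(1, i, j, k)$, and compares its determinant with the square of the $2 \times 2$ determinant.

```python
def det(matrix):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for col, entry in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def left_mult_matrix(a, b, c, d):
    """Matrix of y -> x*y on the basis (1, i, j, k) for x = a + b*i + c*j + d*k."""
    return [
        [a, -b, -c, -d],
        [b, a, -d, c],
        [c, d, a, -b],
        [d, -c, b, a],
    ]


def reduced_norm(a, b, c, d):
    """Determinant of [[alpha, beta], [-conj(beta), conj(alpha)]]."""
    alpha = complex(a, b)
    beta = complex(c, d)
    value = alpha * alpha.conjugate() - beta * (-beta.conjugate())
    return round(value.real)
```

For integer inputs, `det(left_mult_matrix(a, b, c, d))` agrees with `reduced_norm(a, b, c, d) ** 2`, and the reduced norm itself is $a^2 + b^2 + c^2 + d^2$.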


20. Let $D$ be a finite-dimensional central division algebra over $F$, say with
$\dim_F D = n^2$. Choose a basis $\{x_i\}$ of $D$ over $F$, and expand elements of $D$
as $x = \sum_{i=1}^{n^2} c_i x_i$. The function $P(c_1, \dots, c_{n^2}) = \mathrm{Nrd}_{D/F}\bigl(\sum_i c_i x_i\bigr)$ is easily
checked to be a homogeneous polynomial of degree $n$ in $n^2$ variables, and condition
(C1) says that it has a nontrivial zero if $n < n^2$. In this case the corresponding member
$x$ of $D$ would be a nonzero element of $D$ that fails to be invertible, and there is no such
element. We conclude that $n < n^2$ is false, and that means that $n = 1$. Therefore $F$
is the only finite-dimensional central division algebra over $F$, and $\mathcal{B}(F) = 0$.


Chapter IV


1. For (a), every free abelian group of finite rank is in the category, and such
groups provide enough projectives.

Let $I = F \oplus T$ be a decomposition of an injective $I$ as the direct sum of a free
abelian group $F$ of rank $k$ and a torsion group $T$. The sequence $0 \to F \oplus T \to
\tfrac{1}{2}F \oplus T \to (\mathbb{Z}/2\mathbb{Z})^k \to 0$ is exact but not split unless $k = 0$, and thus $F = 0$. Thus
every injective in the category is a finite group, and no infinite group in the category
embeds into an injective.

For (b), every abelian group and in particular every torsion abelian group is a 
subgroup of a divisible group. The torsion subgroup of the divisible group is still 
divisible and is still an injective, and thus every group in the category embeds in an 
injective in the category. 

Let $P$ be a projective in the category mapping onto $\mathbb{Z}/2\mathbb{Z} = \{0, 1\}$ by a
homomorphism $\tau$, and let $x$ be an element of $P$ with $\tau(x) = 1$. If $g$ is a generator of a
cyclic group $G$ of order $2^k$, then there is a homomorphism $\varphi$ of $G$ onto $\mathbb{Z}/2\mathbb{Z}$ with
$\varphi(g) = \tau(x) = 1$. Since $P$ is projective, there exists a homomorphism $\sigma : P \to G$
with $\varphi\sigma = \tau$, and then we have $1 = \tau(x) = \varphi\sigma(x)$. Then $\sigma(x) = g^m$ for some odd
integer $m$, and this has order $2^k$. Hence $x$ has order at least $2^k$. Since $k$ is arbitrary,
$x$ must have infinite order. But all groups in the category are torsion groups, and $P$
therefore cannot exist.


2. Let p be a prime, and let C be the category of all abelian groups that are the 
underlying additive group of a vector space over the field of p elements. This category 
coincides with the category of all direct sums of copies of Z/pZ. Every such abelian 
group is projective and injective for the category.


3. Every unital left R module is the direct sum of simple R modules. Hence every 
short exact sequence splits, and every module is both projective and injective for Cr. 


4. For (a), let $I$ be injective. Given $x \in I$ and $a \neq 0$ in $R$, let $B = C = R$, let
$\tau : R \to I$ have $\tau(r) = rx$, and let $\varphi : R \to R$ have $\varphi(r) = ra$. Setting up Figure
4.4, we obtain $\sigma : R \to I$ with $\tau = \sigma\varphi$. If we put $y = \sigma(1)$ and evaluate both sides
at 1, then we obtain $x = \tau(1) = \sigma(\varphi(1)) = \sigma(a) = a\sigma(1) = ay$, as required.

For (b), suppose that the unital left $R$ module $I$ is divisible. Suppose that $J$ is an
ideal of $R$, and write $J = (a)$. Let $\varphi : J \to I$ be an $R$ homomorphism. Since $I$ is
divisible, there exists $y$ in $I$ with $ay = \varphi(a)$. Then $\varphi$ extends to the $R$ homomorphism
$\Phi$ with $\Phi(1) = y$. By Proposition 4.15, $I$ is injective.


5. Proposition 4.20 shows that there exists an injective $I_0$ containing an isomorphic
copy $M'$ of $M$. Problem 4 shows that $I_0$ is divisible, and hence $I_1 = I_0/M'$ is divisible.
By Problem 4, $I_1$ is injective. Then $0 \to M \to I_0 \to I_1 \to 0$ is an injective resolution
of $M$.


6. If a module $M$ in $\mathcal{C}$ is given, we form the appropriate kind of resolution $X$ in $\mathcal{C}$
needed to compute the derived functors of $G$, and the same $X$ will be appropriate for
computing the derived functors of $F \circ G$. The derived functors of $G$ come from the
homology or cohomology of $G(X)$ with $G(M)$ removed, and the derived functors of
$F \circ G$ come similarly from $F(G(X))$. Thus the result follows from Proposition 4.4.


7. If a module $M$ in $\mathcal{C}$ is given, we form the appropriate kind of resolution $X$ in $\mathcal{C}$
needed to compute the derived functors of $G \circ F$ on $M$. Then $F(X)$ is the appropriate
kind of resolution for computing the derived functors of $G$ on $F(M)$, and the result
follows.


8. For n odd, H”(G, M) is the cohomology of the complex 
Homzg (ZG, M) “— Homzg¢ (ZG, M) <— Homzg(ZG, M), 

while for n even, H"(G, M) is the cohomology of the complex 
Homz¢(ZG, M) <— Homzg(ZG, M) “— Homzg(ZG, M). 


This proves the isomorphisms concerning cohomology. For n odd H,(G, M) is the 
homology of the complex 


ZG ®zq M —> ZG ®zq M —> ZG @zc M, 
while for n even, H,,(G, M) is the homology of the complex 


ZG 2g M —> ZG @2g M > ZG @z¢ M. 


This proves the isomorphisms concerning homology. 

9. For (a), let $\tau_{A,B} : \mathrm{Hom}_{\mathcal{D}}(F(A), B) \to \mathrm{Hom}_{\mathcal{C}}(A, G(B))$ be the natural
isomorphism. Naturality in $B$ says for any $\psi : B \to B'$ that we have

$$\mathrm{Hom}_{\mathcal{C}}(1_A, G(\psi)) \circ \tau_{A,B} = \tau_{A,B'} \circ \mathrm{Hom}_{\mathcal{D}}(1_{F(A)}, \psi)$$

on $\mathrm{Hom}_{\mathcal{D}}(F(A), B)$. Let $P$ be projective in $\mathcal{C}$. We are to prove that $F(P)$ is
projective in $\mathcal{D}$, thus to prove that $\mathrm{Hom}_{\mathcal{D}}(F(P), \cdot\,)$ is exact. We need to show that
whenever $\psi : B \to B'$ is onto in $\mathcal{D}$, then $\mathrm{Hom}_{\mathcal{D}}(1_{F(P)}, \psi)$ is onto. By hypothesis,
$G(\psi) : G(B) \to G(B')$ is onto in $\mathcal{C}$. The displayed equation with $A = P$ has
$\mathrm{Hom}_{\mathcal{C}}(1_P, G(\psi))$ onto, and $\tau_{P,B}$ and $\tau_{P,B'}$ are given as isomorphisms. Therefore
$\mathrm{Hom}_{\mathcal{D}}(1_{F(P)}, \psi)$ is onto, as we were to show. The proof of (b) is similar.


10. Conclusion (a) follows from the natural isomorphism $\mathrm{Hom}_S(P_R^S A, B) =
\mathrm{Hom}_S(S \otimes_R A, B) \cong \mathrm{Hom}_R(A, F_S^R B)$. Conclusion (b) follows from Problem 9a
with $F = P_R^S$ and $G = F_S^R$, since $F_S^R$ is exact and therefore carries onto maps to onto
maps. For (c), $P_R^S A$ is given by the tensor product $S \otimes_R A$, and this tensor product is
an exact functor of $A$ if $S$ is projective as a right $R$ module, by Proposition 4.19a.

For (d), part (c) says that $M \mapsto P_R^S M$ is an exact functor. Taking it to be $F$ in
Problem 7a and $G$ to be $\mathrm{Hom}_S(\,\cdot\,, N)$, we have $\mathrm{Ext}_S^k(P_R^S M, N) = G^k(F(M))$.
Problem 7a says that this is equal to $(G \circ F)^k$. Since $(G \circ F)(M) = \mathrm{Hom}_S(P_R^S M, N) \cong
\mathrm{Hom}_R(M, F_S^R N)$ has $(G \circ F)^k(M) = \mathrm{Ext}_R^k(M, F_S^R N)$, we obtain $\mathrm{Ext}_S^k(P_R^S M, N) \cong
\mathrm{Ext}_R^k(M, F_S^R N)$.

For (e), (b) shows that the chain complex $P_R^S X$ is projective over $P_R^S M$, and we
are assuming that $Y$ is exact (and projective) over $P_R^S M$. Theorem 4.12 says that the
identity map on $P_R^S M$ extends to a chain map $f : P_R^S X \to Y$ that is unique up to
homotopy. Dropping the terms in degree $-1$ and applying the functor $\mathrm{Hom}_S(\,\cdot\,, N)$
to the diagram gives us a cochain map from the complex $\mathrm{Hom}_S(Y, N)$ to the complex
$\mathrm{Hom}_S(P_R^S X, N) \cong \mathrm{Hom}_R(X, F_S^R N)$. Thus we get homomorphisms on cohomology
$\mathrm{Ext}_S^k(P_R^S M, N) \to \mathrm{Ext}_R^k(M, F_S^R N)$.

11. Conclusion (a) follows from the natural isomorphisms $\mathrm{Hom}_S(A, I_R^S B) =
\mathrm{Hom}_S(A, \mathrm{Hom}_R(S, B)) \cong \mathrm{Hom}_R(S \otimes_S A, B) \cong \mathrm{Hom}_R(F_S^R A, B)$. Conclusion (b)
follows from Problem 9b because $F_S^R$ is exact and therefore carries one-one maps
to one-one maps. For (c), $I_R^S = \mathrm{Hom}_R(S, \cdot\,)$ is exact if $S$ is projective as a right $R$
module, by Proposition 4.19a.

For (d), part (c) says that $N \mapsto I_R^S N$ is an exact functor. Taking it to be $F$
in Problem 7b and $G$ to be $\mathrm{Hom}_S(M, \cdot\,)$, we have $\mathrm{Ext}_S^k(M, I_R^S N) = G^k(F(N))$.
Problem 7b says that this is equal to $(G \circ F)^k$. Since $(G \circ F)(N) = \mathrm{Hom}_S(M, I_R^S N) \cong
\mathrm{Hom}_R(F_S^R M, N)$ has $(G \circ F)^k(N) = \mathrm{Ext}_R^k(F_S^R M, N)$, we obtain $\mathrm{Ext}_S^k(M, I_R^S N) \cong
\mathrm{Ext}_R^k(F_S^R M, N)$.

For (e), (b) shows that the cochain complex $I_R^S X$ is injective over $I_R^S N$, and we
are assuming that $Y$ is exact (and injective) over $I_R^S N$. Theorem 4.16 says that the
identity map on $I_R^S N$ extends to a cochain map $f : Y \to I_R^S X$ that is unique up to
homotopy. Dropping the terms in degree $-1$ and applying the functor $\mathrm{Hom}_S(M, \cdot\,)$
to the diagram gives us a cochain map from the complex $\mathrm{Hom}_S(M, Y)$ to the complex
$\mathrm{Hom}_S(M, I_R^S X) \cong \mathrm{Hom}_R(F_S^R M, X)$. Thus we get homomorphisms on cohomology
$\mathrm{Ext}_S^k(M, I_R^S N) \to \mathrm{Ext}_R^k(F_S^R M, N)$.


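12. For (a), the definition of $\Phi_q$ is

$$(\Phi_q\varphi)(g_1, \dots, g_q) = \varphi(1, g_1, g_1g_2, \dots, g_1 \cdots g_q)$$

for $\varphi \in \mathrm{Hom}_{\mathbb{Z}G}(F_q, M)$. Putting $f = \Phi_q\varphi$ gives $(\rho^* f)(g_1, \dots, g_q) = \rho^*(\Phi_q\varphi) =
\Phi_q(\varphi \circ \rho) = (\Phi_q\varphi) \circ \rho$, as asserted.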

For inflation the groups are $(G, G') = (G, G/H)$, and the map $\rho$ is the quotient
map; the effect is given by $(\mathrm{Inf}\, f)(g_1, \dots, g_q) = f(g_1H, \dots, g_qH)$ for $f$ in
$C^q(G/H, M^H)$. For restriction the groups are $(G, G') = (H, G)$, and the map
is the inclusion; the effect is given by $(\mathrm{Res}\, \psi)(h_1, \dots, h_q) = \psi(h_1, \dots, h_q)$ for
$\psi \in C^q(G, M)$.

For (b), let $f$ be in $C^1(G/H, M^H)$. Then $\mathrm{Res}(\mathrm{Inf}(f))(h) = \mathrm{Inf}(f)(h) =
f(hH) = f(H)$. The condition for $f$ to be a cocycle is that $\delta_1 f = 0$, i.e., that
$f(uv) = f(u) + u(f(v))$ for $u$ and $v$ in $G/H$. Taking $u$ and $v$ to be the identity coset
$H$ shows that $f(H) = 0$.

For (c), let $f \in C^1(G/H, M^H)$ be a cocycle. Then $\mathrm{Inf}(f)(g) = f(gH)$. If this
is a coboundary in $C^1(G, M)$, then there exists $\psi \in M$ with $\delta_0\psi = \mathrm{Inf}(f)$, i.e., with
$f(gH) = g\psi - \psi$ for all $g$. The left side depends only on the coset $gH$, and hence
so must the right side. Then it follows that $gh\psi = g\psi$ for all $h \in H$ and that
$\psi$ is in $M^H$. Then the formula $f(gH) = g\psi - \psi$ exhibits $f$ as a coboundary in
$C^1(G/H, M^H)$.

For (d), let $f$ be a cocycle in $C^1(G, M)$ such that $\mathrm{Res}\, f$ is a coboundary in
$C^1(H, M)$. The formula is $(\mathrm{Res}\, f)(h) = f(h)$, and the coboundary condition shows
that there is some $\psi \in M^H$ with $f(h) = h\psi - \psi$ for $h \in H$. Since $\psi$ is in $M^H$,
$f(h) = 0$ for all $h \in H$. The cocycle condition on $f$ is that $f(uv) = f(u) + u(f(v))$
for all $u$ and $v$ in $G$. Taking $v$ to be in $H$ shows that $f(gh) = f(g)$ for all $h \in H$.
Taking instead $u$ to be in $H$ shows that $f(hg) = h(f(g))$ for all $h \in H$. Since $H$ is
normal, $h(f(g)) = f(g)$ for all $h \in H$. Therefore $f$ takes values in $M^H$ and is Inf
of the cocycle $\bar{f}$ in $C^1(G/H, M^H)$ given by $\bar{f}(gH) = f(g)$.


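13. For (a), we have $(g_0\varphi_m)(g) = \varphi_m(gg_0) = gg_0m = \varphi_{g_0m}(g)$, and $m \mapsto \varphi_m$ is a
$\mathbb{Z}G$ homomorphism. Suppose that $\varphi_m = 0$. Then $gm = 0$ for all $g$ and in particular
for $g = 1$. Therefore $m = 0$, and $m \mapsto \varphi_m$ is one-one. Then it follows that the
sequence is exact.

For (b), we know that $\mathbb{Z}G$ as an abelian group is free abelian. Then Problem 11d
shows that $H^k(G, B) = \mathrm{Ext}^k_{\mathbb{Z}G}(\mathbb{Z}, B) = \mathrm{Ext}^k_{\mathbb{Z}G}(\mathbb{Z}, I_{\mathbb{Z}}^{\mathbb{Z}G}(F_{\mathbb{Z}G}^{\mathbb{Z}} M)) \cong \mathrm{Ext}^k_{\mathbb{Z}}(\mathbb{Z}, F_{\mathbb{Z}G}^{\mathbb{Z}} M)$.
Since $\mathrm{Hom}_{\mathbb{Z}}(\mathbb{Z}, \cdot\,)$ is exact from $\mathcal{C}_{\mathbb{Z}}$ to itself, $\mathrm{Ext}^k_{\mathbb{Z}}(\mathbb{Z}, F_{\mathbb{Z}G}^{\mathbb{Z}} M) = 0$ for $k \geq 1$.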

For (c), a $\mathbb{Z}$ basis of $\mathbb{Z}G$ consists of all 1-tuples $(g)$ with $g \in G$, and a $\mathbb{Z}$ basis of
$\mathbb{Z}H$ consists of all $(h)$ with $h \in H$. Let $\{v\}$ be a set of representatives of the cosets of
$G/H$, and let $A$ be the free abelian group on $\{v\}$. The $\mathbb{Z}$-bilinear map $(v, (h)) \mapsto (vh)$
extends to a homomorphism of $A \otimes_{\mathbb{Z}} \mathbb{Z}H$ into $\mathbb{Z}G$ that is manifestly onto, and it is
one-one because $\sum n_i(v_ih_i) = 0$ implies $n_i = 0$ for all $i$. Thus it is an isomorphism.
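
The isomorphism in (c) rests on the fact that every element of $G$ is uniquely a product $vh$ of a coset representative and an element of $H$. The sketch below illustrates this for a small example chosen only for the illustration, namely $G = S_3$ acting on $\{0, 1, 2\}$ and $H$ the cyclic subgroup of order 3; the function names are hypothetical.

```python
from itertools import permutations


def compose(p, q):
    """Return the permutation p∘q, applying q first and then p."""
    return tuple(p[q[i]] for i in range(len(q)))


# G = S_3 as tuples (images of 0, 1, 2); H = the cyclic subgroup of order 3.
G = list(permutations(range(3)))
H = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]


def coset_representatives(group, subgroup):
    """Pick one representative v from each coset vH."""
    reps, seen = [], set()
    for g in group:
        if g not in seen:
            reps.append(g)
            seen.update(compose(g, h) for h in subgroup)
    return reps


def decomposition_map(group, subgroup):
    """Map each pair (v, h) to the product vh, mirroring (v, (h)) -> (vh)."""
    reps = coset_representatives(group, subgroup)
    return {(v, h): compose(v, h) for v in reps for h in subgroup}
```

The values of `decomposition_map(G, H)` run through each element of $G$ exactly once, so the basis elements $v \otimes (h)$ of $A \otimes_{\mathbb{Z}} \mathbb{Z}H$ correspond bijectively to the basis elements $(g)$ of $\mathbb{Z}G$.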

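For (d), use of (c) gives $F_{\mathbb{Z}G}^{\mathbb{Z}H} B = F_{\mathbb{Z}G}^{\mathbb{Z}H} \mathrm{Hom}_{\mathbb{Z}}(\mathbb{Z}G, M) = \mathrm{Hom}_{\mathbb{Z}}(F_{\mathbb{Z}G}^{\mathbb{Z}H}(\mathbb{Z}G), M)
\cong \mathrm{Hom}_{\mathbb{Z}}(A \otimes_{\mathbb{Z}} \mathbb{Z}H, M) \cong \mathrm{Hom}_{\mathbb{Z}}(\mathbb{Z}H, \mathrm{Hom}_{\mathbb{Z}}(A, M))$, and then $H^k(H, F_{\mathbb{Z}G}^{\mathbb{Z}H} B) = 0$
for $k \geq 1$ by the same argument as in (b).

For (e), the long exact sequence for $\mathrm{Ext}_{\mathbb{Z}H}(\mathbb{Z}, \cdot\,)$ that comes from the short exact
sequence in (a) shows that $0 \to H^0(H, M) \to H^0(H, B) \to H^0(H, N) \to
H^1(H, M)$ is exact. The right member is assumed to be 0, and the three middle
members are isomorphic to $M^H$, $B^H$, and $N^H$.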

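For (f), consider the $\mathbb{Z}$ bilinear map $(1, (g)) \mapsto (gH)$ of $\mathbb{Z} \times \mathbb{Z}G$ into $\mathbb{Z}(G/H)$, and
extend it to a $\mathbb{Z}$ linear map of $\mathbb{Z} \otimes_{\mathbb{Z}} \mathbb{Z}G$ into $\mathbb{Z}(G/H)$. The group $H$ acts trivially on
$\mathbb{Z}$ on the right, and it acts on $\mathbb{Z}(G/H)$ by left translation. Let $h$ be in $H$. The passage
$\mathbb{Z} \times \mathbb{Z}G \to \mathbb{Z}(G/H)$ has $(1h, (g)) \mapsto (gH)$ and $(1, h(g)) \mapsto h(gH) = (gH)$;
thus the group homomorphism $\mathbb{Z} \otimes_{\mathbb{Z}} \mathbb{Z}G \to \mathbb{Z}(G/H)$ descends to a homomorphism
of $\mathbb{Z} \otimes_{\mathbb{Z}H} \mathbb{Z}G$ into $\mathbb{Z}(G/H)$. This is certainly onto. To see that it is one-one, let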
; njl ®@ (gj) + O. Then xs n,(g;H) = O, and for each coset representative v 
in G. Do cyy Mi(Bi) = 0. So); ni(h;'v) = 0, and (30, nj(h;'))(@) = 0. Then 
ys: nih!) = 0 in ZH because (v) is invertible in ZG, and it follows that the map 
is one-one. 

For (g), (f) gives $B^H = \mathrm{Hom}_{\mathbb{Z}H}(\mathbb{Z}, \mathrm{Hom}_{\mathbb{Z}}(\mathbb{Z}G, M)) \cong \mathrm{Hom}_{\mathbb{Z}}(\mathbb{Z} \otimes_{\mathbb{Z}H} \mathbb{Z}G, M) \cong
\mathrm{Hom}_{\mathbb{Z}}(\mathbb{Z}(G/H), M)$, and the same argument as in (b) shows that $H^k(G/H, B^H) = 0$
for $k \geq 1$.

Conclusion (h) is immediate because $q \geq 2$ and because all the cohomology
associated with $B$ has been shown to be 0 in degrees $\geq 1$.

The commutativity in conclusion (i) follows because the inflation and restriction
mappings are clearly functorial. The vertical mappings have been shown to be
isomorphisms in (h). To see via induction that the top row is exact, we have to
verify that $H^k(H, N) = 0$ for $1 \leq k \leq q - 2$; but $H^k(H, N) \cong H^{k+1}(H, M)$ for all
$k \geq 1$, and $H^{k+1}(H, M)$ is assumed to be 0 for $k + 1 \leq q - 1$. Therefore the bottom
row is exact, and the induction is complete.


14-16. These problems are routine verifications. 


17. Part (a) follows because $R \otimes_R A$ is naturally isomorphic to $A$. For (b), $F \otimes_R A
= \bigoplus_{s \in S} (F_s \otimes_R A)$ and $1_F \otimes f$ corresponds to $\bigoplus (1_{F_s} \otimes f)$. The values of the
various $R$ homomorphisms are in the various spaces $F_s \otimes_R B$, whose sum is direct,
and thus the kernel of $1_F \otimes f$ is the direct sum of the kernels. Then (b) follows. For
(c), we see from (a) and (b) that free $R$ modules are flat. In $\mathcal{C}_R$, every projective is a
direct summand of a free module, and thus (c) follows by a second application of (b).


18. Consider $1 \otimes f : M \otimes_R A \to M \otimes_R B$. Any element of $\ker(1 \otimes f)$ is a finite
sum $\sum m_i \otimes a_i$, and this lies in $\ker((1 \otimes f)|_{M_F})$, where $F$ is the finite set of indices
in question. Thus $\ker(1 \otimes f) \neq 0$ implies $\ker((1 \otimes f)|_{M_F}) \neq 0$ for some $F$. The
converse is immediate because $\ker((1 \otimes f)|_{M_F}) \subseteq \ker(1 \otimes f)$ for all $F$.

19. The long exact sequence for tensor product over $R$ is of the form

$$\cdots \to \mathrm{Tor}_1^R(A, F) \to \mathrm{Tor}_1^R(A, B) \to A \otimes_R K \to A \otimes_R F \to A \otimes_R B \to 0,$$

and $\mathrm{Tor}_1^R(A, F) = 0$ because $F$ is projective for $\mathcal{C}_R$. This establishes the exactness
of the sequence in the problem. If $A$ is flat, then

$$0 \to \mathrm{Tor}_1^R(A, B) \to A \otimes_R K \to A \otimes_R F \to A \otimes_R B \to 0$$

is exact for each $B$, and $\mathrm{Tor}_1^R(A, B)$ must be 0 for each $B$. Conversely if $\mathrm{Tor}_1^R(A, B)$
is 0 for each $B$, then $A \otimes_R (\,\cdot\,)$ is an exact functor by Proposition 4.3. Hence $A$ is flat
by definition.


20. On the one hand, the long exact sequence associated to tensoring the short
exact sequence given in (a) by $B$ is of the form

$$0 \to \mathrm{Tor}_1^R(M, B) \to \mathrm{Tor}_1^R(T(M), B) \to F \otimes_R B \to M \otimes_R B \to T(M) \otimes_R B \to 0,$$

since $F$ free implies $\mathrm{Tor}_1^R(F, B) = 0$. On the other hand, the given short exact
sequence splits, and tensoring it by $B$ must directly produce a short exact sequence

$$0 \to F \otimes_R B \to M \otimes_R B \to T(M) \otimes_R B \to 0.$$

Thus $\ker(F \otimes_R B \to M \otimes_R B) = 0$, and we must therefore have

$$\mathrm{image}(\mathrm{Tor}_1^R(T(M), B) \to F \otimes_R B) = \ker(F \otimes_R B \to M \otimes_R B) = 0.$$

Consequently $0 \to \mathrm{Tor}_1^R(M, B) \to \mathrm{Tor}_1^R(T(M), B) \to 0$ is exact. This proves (a).

For (b), Problem 18 shows that $M$ is flat if and only if each $M_F$ is flat, and
(a) in combination with Problem 19 shows that each $M_F$ is flat if and only if each
$T(M_F)$ is flat. Now suppose that $M$ is flat, so that $T(M_F)$ is flat for each finite
subset $F$ of $M$. This is true in particular for each finite subset $F'$ of $T(M)$, and
$T(M_{F'}) = M_{F'} = (T(M))_{F'}$. Hence Problem 18 shows that $T(M)$ is flat. Conversely
suppose that $T(M)$ is flat. Then $T(M)_{F'}$ is flat for each finite subset $F'$ of $T(M)$.
Let $F$ be a finite subset of $M$. Then $M_F$ is a finitely generated $R$ submodule, and
the structure theorem shows that $T(M_F)$ is finitely generated. Let $F'$ be a set of
generators for it. Then $T(M_F) = M_{F'} = T(M)_{F'}$. This is flat by Problem 18, since
$T(M)$ is flat, and the first sentence of this paragraph allows us to conclude that $M$ is
flat.

For (c), $T(M) \neq 0$ means that $am = 0$ for some nonzero $a \in R$ and $m \in M$.
Let $i : (a) \to R$ be the inclusion, which is one-one. Then $i \otimes 1 : (a) \otimes_R M \to
R \otimes_R M \cong M$ has $(i \otimes 1)(a \otimes m) = am = 0$. Thus the one-one map $i$ is carried to
the map $i \otimes 1$ that is not one-one, and tensoring with $M$ is not exact. So $M$ is not flat.

For (d), if $M$ is flat, then $T(M) = 0$ by (c). Conversely if $T(M) = 0$, then $T(M)$
is flat, and (b) shows that $M$ is flat.

21. Since $\partial'_{p,q}$ and $\partial''_{p,q}$ both lower $p + q$ by 1, they both carry $E_{p,q}$ to $E_{p+q-1}$.
Also, the hypotheses give $(\partial' + \partial'')^2 = \partial'\partial' + \partial'\partial'' + \partial''\partial' + \partial''\partial'' = 0$,
and we have a chain complex.


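22. We compute that $\partial'_{p-1,q}\partial'_{p,q} = (\alpha_{p-1} \otimes 1)(\alpha_p \otimes 1) = \alpha_{p-1}\alpha_p \otimes 1 = 0$,
that $\partial'_{p,q-1}\partial''_{p,q} + \partial''_{p-1,q}\partial'_{p,q} = (\alpha_p \otimes 1)(-1)^p(1 \otimes \beta_q) + (-1)^{p-1}(1 \otimes \beta_q)
(\alpha_p \otimes 1) = (-1)^p(\alpha_p \otimes \beta_q) - (-1)^p(\alpha_p \otimes \beta_q) = 0$, and that $\partial''_{p,q-1}\partial''_{p,q} =
(-1)^p(1 \otimes \beta_{q-1})(-1)^p(1 \otimes \beta_q) = 1 \otimes \beta_{q-1}\beta_q = 0$.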

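23. The formulas for $\partial'_{p,q}$ and $\partial''_{p,q}$ show that $\ker \partial'_{p,q} = \ker \alpha_p \otimes_R D_q$, and that
$\ker \partial''_{p,q} = C_p \otimes_R \ker \beta_q$. Since $\partial'_{p,q} E_{p,q}$ and $\partial''_{p,q} E_{p,q}$ lie in independent spaces,
$\ker((\partial' + \partial'')|_{E_{p,q}}) = \ker \partial'_{p,q} \cap \ker \partial''_{p,q} = \ker \alpha_p \otimes_R \ker \beta_q$. Similarly $\partial'_{p+1,q}(E_{p+1,q}) =
\alpha_{p+1}(C_{p+1}) \otimes_R D_q$ and $\partial''_{p,q+1}(E_{p,q+1}) = C_p \otimes_R \beta_{q+1}(D_{q+1})$, and hence

$$\mathrm{image}(\partial'_{p+1,q} + \partial''_{p,q+1}) = \alpha_{p+1}(C_{p+1}) \otimes_R D_q + C_p \otimes_R \beta_{q+1}(D_{q+1}).$$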


Thus if c is in C,, d is in Dg, c’ is in @p41(Cp41), and d’ is in By+1(Dq+1), then 
(Oo + a” (Cc +c) @(dt+ d')) is the sum of (Oi + 05 ge ® d) and three 
terms that are in image(0’, gk ar a 41). Consequently we obtain a well-defined 


homomorphism of $H_p(C) \otimes_R H_q(D)$ into $H_{p+q}(E)$.


24. Let $\partial'$ and $\partial''$ be the boundary operators; these satisfy $\partial'\partial'' = -\partial''\partial'$. Let $a$
be a cycle in $E_{-1,k}$, i.e., let $\partial''a = 0$. Since $\partial'a = 0$, the exactness for $\partial'$ produces
$c_{0,k} \in E_{0,k}$ with $a = \partial'c_{0,k}$. Since $\partial''a = 0$, this has $\partial'\partial''c_{0,k} = -\partial''\partial'c_{0,k} =
-\partial''a = 0$. Now suppose inductively on $i \ge 0$ that $j \ge 0$ is defined by $i + j = k$ and
that $c_{i,j} \in E_{i,j}$ is given with $\partial'\partial''c_{i,j} = 0$. By the assumed exactness, $\partial'\partial''c_{i,j} = 0$
implies $\partial''c_{i,j} = \partial'c_{i+1,j-1}$ for some $c_{i+1,j-1} \in E_{i+1,j-1}$, and then $\partial'\partial''c_{i+1,j-1} =
-\partial''\partial'c_{i+1,j-1} = -\partial''\partial''c_{i,j} = 0$. The induction leads us nonuniquely to $c_{k,0} \in E_{k,0}$
such that $\partial'\partial''c_{k,0} = 0$. Define $b \in E_{k,-1}$ by $b = \partial''c_{k,0}$, and then $\partial'b = 0$. The result
of the construction is therefore that we pass nonuniquely from the cocycle $a \in E_{-1,k}$
for $\partial''$ to a cocycle $b \in E_{k,-1}$ for $\partial'$.

Inverting the steps and the choices, we see that we can pass from $b$ back to $a$. Thus if
we can address the nonuniqueness, then the isomorphism in homology will have been
established. We are to show that if $a \in E_{-1,k}$ at the start is a boundary relative to $\partial''$,
then any system of choices leads to a result $b \in E_{k,-1}$ that is a boundary for $\partial'$. Since
$a$ is assumed to be a boundary for $\partial''$, $a = \partial''a'$ with $a' \in E_{-1,k+1}$. The element $a'$ has
$\partial'a' = 0$, and thus $a' = -\partial'a_{0,k+1}$ for some $a_{0,k+1} \in E_{0,k+1}$. Meanwhile, the above
construction makes $a = \partial'c_{0,k}$. So $\partial'\partial''a_{0,k+1} = -\partial''\partial'a_{0,k+1} = \partial''a' = a = \partial'c_{0,k}$.
By exactness, $c_{0,k} - \partial''a_{0,k+1} = \partial'b_{1,k}$ for some $b_{1,k} \in E_{1,k}$. This proves that $c_{0,k}$ is
of the form $c_{0,k} = \partial''a_{0,k+1} + \partial'b_{1,k}$ with $a_{0,k+1} \in E_{0,k+1}$ and $b_{1,k} \in E_{1,k}$. (Note that
this form for $c_{0,k}$ already implies that $\partial'\partial''c_{0,k} = 0$.)

Now suppose inductively on $i \ge 0$ that $j \ge 0$ is defined by $i + j = k$ and
that $c_{i,j} \in E_{i,j}$ is given with $c_{i,j} = \partial''a_{i,j+1} + \partial'b_{i+1,j}$. The constructed element
$c_{i+1,j-1} \in E_{i+1,j-1}$ has $\partial''c_{i,j} = \partial'c_{i+1,j-1}$. Thus
$\partial'c_{i+1,j-1} = \partial''\partial'b_{i+1,j} = -\partial'\partial''b_{i+1,j}$ and $c_{i+1,j-1} + \partial''b_{i+1,j} = \partial'b_{i+2,j-1}$. If
we put $a_{i+1,j} = -b_{i+1,j}$, then we have $c_{i+1,j-1} = \partial''a_{i+1,j} + \partial'b_{i+2,j-1}$, and the
induction goes through to $i = k$. Consequently any choice of $c_{k,0}$ obtained starting
from the boundary $a$ is of the form $c_{k,0} = \partial''a_{k,1} + \partial'b_{k+1,0}$. The final step is to define
$b = \partial''c_{k,0}$, and then we have $b = \partial''\partial'b_{k+1,0} = -\partial'\partial''b_{k+1,0}$, and $b$ is exhibited as a
boundary relative to $\partial'$.

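25. Since each $C_p$ is projective for $p \ge 0$, $C_p \otimes_R D$ is exact. Similarly $C \otimes_R D_q$ is
exact for $q \ge 0$. The hypotheses of Problem 24 are satisfied, and the two homologies
match.

26. $H_0(C) = H_0(C') = H_0(D) = \mathbb Z/2\mathbb Z$, and $H_p(C) = H_p(C') = H_p(D) =
0$ for $p \ne 0$. $H_0(C \otimes_{\mathbb Z} D) = H_0(C' \otimes_{\mathbb Z} D) = \mathbb Z/2\mathbb Z$, $H_1(C \otimes_{\mathbb Z} D) = 0$ and
$H_1(C' \otimes_{\mathbb Z} D) = \mathbb Z/2\mathbb Z$, $H_p(C \otimes_{\mathbb Z} D) = H_p(C' \otimes_{\mathbb Z} D) = 0$ for $p \notin \{0, 1\}$.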

27. Let $Z_p = \ker \partial_p \subseteq C_p$, $B_p = \operatorname{image} \partial_{p+1} \subseteq C_p$, and $B'_p = B_{p-1}$. Since $R$
is a principal ideal domain, Problem 20 shows that flat is equivalent to torsion free.
Modules of the complex $C$ are flat by assumption, hence torsion free. Modules of $Z$
and $B'$ are $R$ submodules of these, hence are torsion free, hence are flat.


28. The long exact sequence in homology shows that

$$\operatorname{Tor}_1^R(B', D) \to Z \otimes_R D \to C \otimes_R D \to B' \otimes_R D \to 0$$

is exact. Since $B'$ is flat, Problem 19 shows that $\operatorname{Tor}_1^R(B', D) = 0$.

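29. For (a), the boundary map on $B'_p \otimes_R D_q$ in $B' \otimes_R D$ is $\partial' \otimes 1 + (-1)^p(1 \otimes \partial'')$,
and $\partial' = 0$ on boundaries in $B'$.

For (b), tensoring with $B'$ is an exact functor, since $B'$ is flat. Therefore the
exactness of $0 \to Z \to D \xrightarrow{\partial''} B' \to 0$ implies the exactness of

$$0 \to (B' \otimes_R Z)_n \to (B' \otimes_R D)_n \xrightarrow{1 \otimes \partial''} (B' \otimes_R B')_n \to 0$$

for each $n$. From the exactness of this sequence, we can read off that $\ker(1 \otimes \partial'')_n$
within $(B' \otimes_R D)_n$ is $(B' \otimes_R Z)_n$ and that $\operatorname{image}(1 \otimes \partial'')_n$ on $(B' \otimes_R D)_n$ is $(B' \otimes_R B')_n$,
which is the same thing as $(B' \otimes_R B)_{n-1}$.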
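For (c), the results of (b) show that

$$H_n(B' \otimes_R D) = \ker(1 \otimes \partial'')_n / \operatorname{image}(1 \otimes \partial'')_{n+1} = (B' \otimes_R Z)_n / (B' \otimes_R B)_n.$$

Since tensoring with $B'$ is exact, the exactness of $0 \to B \to Z \to H(D) \to 0$
implies the exactness of

$$0 \to B' \otimes_R B \to B' \otimes_R Z \to B' \otimes_R H(D) \to 0$$

in each degree. Thus $B' \otimes_R H(D) \cong (B' \otimes_R Z)/(B' \otimes_R B)$, and $H_n(B' \otimes_R D) \cong
(B' \otimes_R H(D))_n = (B \otimes_R H(D))_{n-1}$.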
Part (d) is handled in a fashion similar to (c). 
30. For (a), $\operatorname{Tor}_1^R(Z, H(D)) = 0$ because $Z$ is flat.

In (b), comparison of the exact sequence with kerw,_; with the exact sequence 
displayed before part (a) (but with n replaced by n — 1) shows that kerw,_1 is 
isomorphic to Tor} (H(C), H(D)),-1. Substituting for ker @,_1 and incorporating 
the isomorphism into the mapping into H,(B’ @r D) leads to f/_, as the one-one 
mapping. 

In (c), we have 


coker(t @ 1) = Ay, (C @r D)/ image(t, @ 1) = Hy(C @r D)/ker(d), @ 1) 
~ image(d’ @ 1) = ker@,_1 = Tor{(H(C), H(D))n-1. 


The composition of maps leading from H,,(C @r D) to H,(B’ ®r D) has to be 
0), ® 1, and thus 6’ |B, = 0, @ 1. The map £,_1, apart from isomorphisms, is 
onto because q was constructed as onto. 

Part (d) is completely analogous, and the resulting map @,, is one-one. 

For (e), we know that @ is one-one and that 6 is onto. Also, we have By Bn—-1n ee), 
= (0, ® 1), @ 1) = 0. Since B/_, is one-one and a, is onto, B,_ja@, = 0. 
Finally suppose that x is in ker ,_;. Then x is in ker(6’_,B,-1) = ker(0/, ®@ 1) = 
image(., ® 1) = image(@,a/j,) = image a,. This completes the proof of exactness. 

31. This is immediate. 


32. Let $X = \{X_n\}$ and $Y = \{Y_n\}$. Then $\operatorname{Morph}(X, Y)$ is the subgroup of
$\prod_n \operatorname{Hom}(X_n, Y_n)$ consisting of those elements in the product satisfying the chain
map conditions. A zero object is any tuple of 0's, and certainly product and coproduct
make sense. One readily verifies that the tuple of kernels of a chain map furnishes a
kernel for a chain map and that the tuple of cokernels furnishes a cokernel.

33. The additional objects and morphisms at the top of the extended diagram are 
Co = 2Z2/8Z, Bo = Z,k given by 2 mod 8 +> 2 mod 8, k given by x 2, v given by 
1+> 2 mod 8, and @ given by x 4. Since the composition of k followed by B=x2 
is not O, (Bo, k) cannot be the kernel of 6. 

The additional objects and morphisms at the bottom of the extended diagram are 
Ap = Z/4Z, By = Z/16Z, p given by 1 +> 1 mod 4, p given by 1 + 1 mod 16, 9’ 
given by 1 mod 4+> 4 mod 16, and y given by | mod 16+> | mod 4. 


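34. We give the argument only for $\operatorname{Hom}(M, \,\cdot\,)$. Let $0 \to A \xrightarrow{\varphi} B \xrightarrow{\psi} C \to 0$ be
a given exact sequence, and form the sequence

$$0 \to \operatorname{Hom}(M, A) \xrightarrow{\operatorname{Hom}(1,\varphi)} \operatorname{Hom}(M, B) \xrightarrow{\operatorname{Hom}(1,\psi)} \operatorname{Hom}(M, C).$$

We are to show that $\operatorname{Hom}(1, \varphi)$ is one-one and that exactness holds at $\operatorname{Hom}(M, B)$.
If $\sigma$ is in $\operatorname{Hom}(M, A)$ with $\operatorname{Hom}(1, \varphi)(\sigma) = 0$, then $\varphi\sigma = 0$, and it follows that
$\sigma = 0$ because $\varphi$ is a monomorphism.
For the exactness at $\operatorname{Hom}(M, B)$, we use Theorem 4.42e. We know immediately
that $\operatorname{Hom}(1, \psi)\operatorname{Hom}(1, \varphi) = \operatorname{Hom}(1, \psi\varphi) = \operatorname{Hom}(1, 0) = 0$. Thus suppose that
$\tau \in \operatorname{Hom}(M, B)$ has $\operatorname{Hom}(1, \psi)\tau = 0$. This condition means that $\psi\tau = 0$. Since
the given sequence is exact, Theorem 4.42e produces some $\tau' \in_m A$ with $\varphi\tau' = \tau$.
In turn, this says that $\operatorname{Hom}(1, \varphi)\tau' = \tau$. By Theorem 4.42, we have exactness at
$\operatorname{Hom}(M, B)$.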

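35. We give the proof only that the splitting of exact sequences as indicated
implies that $P$ is projective. Thus suppose that a morphism $\tau \in \operatorname{Hom}(P, B)$ and an
epimorphism $\psi \in \operatorname{Hom}(C, B)$ are given. We are to produce $\sigma \in \operatorname{Hom}(P, C)$ with
$\tau = \psi\sigma$. Let $(W, \tilde\psi, \tilde\tau)$ be a pullback of $(\psi, \tau)$. Then $\tau\tilde\psi = \psi\tilde\tau$, and Proposition
4.40 shows that $\tilde\psi$ is an epimorphism. Then it follows that

$$0 \to \operatorname{domain}(\ker \tilde\psi) \xrightarrow{\ker \tilde\psi} W \xrightarrow{\tilde\psi} P \to 0$$

is exact, and it must split by assumption. Thus there exists $\rho \in \operatorname{Hom}(P, W)$ with
$\tilde\psi\rho = 1_P$. Put $\sigma = \tilde\tau\rho$. Then $\psi\sigma = \psi\tilde\tau\rho = \tau\tilde\psi\rho = \tau 1_P = \tau$, as required.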


Chapter V 


1. If $\xi$ is a root of $F(X)$, then the given formula shows that $D(\xi)$ is $-23$ and $-31$
in the two cases. These contain no square factor and therefore equal $D_K$ in the two
cases.

2. For (a), let $G(X) = F(X + \tfrac23) = X^3 - \tfrac43 X + \tfrac{38}{27}$. Then $F(X)$ and $G(X)$
have the same discriminant, and the discriminant for $G(X)$ is given by the formula
of Problem 1. It is $-44$.

For (b), let $x = a + b\xi + c\xi^2$ be given with $a, b, c$ all in $\{0, 1\}$. The matrix of
left-by-$x$ in the ordered basis $(1, \xi, \xi^2)$ works out to be

$$\begin{pmatrix} a & -2c & -2b-4c \\ b & a & -2c \\ c & b+2c & a+2b+4c \end{pmatrix},$$

and the determinant of it is

$$a^3 + 2a^2(b + 2c) + 4c^3 - 2b(b + 2c)^2 + 4ac(b + 2c) + 2bc(a + 2b + 4c).$$

For $x$ to be twice an algebraic integer, this determinant, which is the norm of $x$, has
to be $\equiv 0 \bmod 8$. All the terms are even except possibly the first, and thus $a$ has to be
even. That is, $a = 0$. The determinant then reduces to $4c^3 - 2b(b+2c)^2 + 4bc(b+2c)$.
All terms here are divisible by 4 except possibly $-2b^3$. Thus $b$ must be even. That
is, $b = 0$. The determinant reduces in this case to $4c^3$. For this to be divisible by 8, $c$
must be even. That is, $c = 0$. Proposition 5.2 consequently says that a further factor
of $2^2$ cannot be eliminated from the discriminant.
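
The case analysis in (b) can also be checked by brute force. The following is a minimal sketch, not part of the original solution: it evaluates the determinant of the displayed matrix for all eight choices of $a, b, c$ in $\{0, 1\}$ and keeps those whose norm is divisible by 8. The helper names `det3` and `norm_of` are chosen here for illustration.

```python
from itertools import product

def det3(m):
    """Determinant of a 3x3 integer matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def norm_of(a, b, c):
    """Norm of x = a + b*xi + c*xi^2, read off from the matrix of left-by-x."""
    matrix = [
        [a, -2 * c, -2 * b - 4 * c],
        [b, a, -2 * c],
        [c, b + 2 * c, a + 2 * b + 4 * c],
    ]
    return det3(matrix)

# Triples (a, b, c) in {0,1}^3 whose norm is divisible by 8.
divisible_by_8 = [t for t in product((0, 1), repeat=3) if norm_of(*t) % 8 == 0]
print(divisible_by_8)
```

Only the zero triple survives, in agreement with the conclusion $a = b = c = 0$.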
3. For (a), Theorem 5.21 and the remarks after it show that every equivalence
class contains an ideal whose norm is $\le (0.283)|D_K|^{1/2}$. Proposition 5.8 shows that
$D_K = -3^5 = -243$. Thus every equivalence class contains an ideal with norm $\le 4$.
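
The arithmetic behind the bound in (a) can be sketched as follows. This is an illustration under an assumption: the constant 0.283 is taken to be the usual Minkowski constant $(4/\pi)^{r_2}\, n!/n^n$ for a cubic field with one pair of complex embeddings ($n = 3$, $r_2 = 1$); the function name `minkowski_bound` is hypothetical.

```python
import math

def minkowski_bound(degree, complex_pairs, discriminant):
    """(4/pi)^r2 * n!/n^n * sqrt(|D|), assumed here to be the bound used in the text."""
    constant = (4 / math.pi) ** complex_pairs * math.factorial(degree) / degree ** degree
    return constant * math.sqrt(abs(discriminant))

constant = minkowski_bound(3, 1, 1)      # the constant alone, about 0.283
bound = minkowski_bound(3, 1, -243)      # bound for discriminant of absolute value 243
print(round(constant, 3), math.floor(bound))
```

Since ideal norms are integers, only the integer part of the bound matters.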

Conclusion (b) is immediate from Theorem 5.6 with $F(X) = X^3 - 3$. Conclusion
(c) follows because $(\sqrt[3]{3} - 1)(\sqrt[3]{9} + \sqrt[3]{3} + 1) = (\sqrt[3]{3})^3 - 1 = 3 - 1 = 2$. Conclusion
(d) is immediate from Proposition 5.10d.

For (e), any nonzero ideal is the product of powers of prime ideals associated with 
the various prime numbers. The ones corresponding to the prime numbers 2 and 3 are 
principal ideals by (b), (c), and (d). These are the only ones that need to be checked, 
according to (a). Thus every nonzero ideal is principal. 

4. Conclusion (a) is immediate from Theorem 5.6, since $X^3 - 7$ factors modulo 2
as $(X + 1)(X^2 + X + 1)$. For (b), we show that no element $x = a + b\sqrt[3]{7} + c\sqrt[3]{49}$
has norm $\pm 2$. Left multiplication by $x$ carries 1 to $a + b\sqrt[3]{7} + c\sqrt[3]{49}$, carries $\sqrt[3]{7}$ to
$7c + a\sqrt[3]{7} + b\sqrt[3]{49}$, and carries $\sqrt[3]{49}$ to $7b + 7c\sqrt[3]{7} + a\sqrt[3]{49}$. Thus its matrix is

$$\begin{pmatrix} a & 7c & 7b \\ b & a & 7c \\ c & b & a \end{pmatrix}.$$

The determinant is $a^3 + 49c^3 + 7b^3 - 21abc$, which is congruent modulo 7 to $a^3$.
Modulo 7, the cubes are 0 and $\pm 1$, and thus the congruence $a^3 \equiv \pm 2 \bmod 7$ has no
solution.
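
The two facts used in (b) are easy to confirm numerically. The sketch below lists the cubes modulo 7 and checks, on a small box of integer triples chosen arbitrarily for illustration, that the norm form is congruent to $a^3$ modulo 7 and never equals $\pm 2$.

```python
from itertools import product

def norm_form(a, b, c):
    """Determinant of the matrix of left multiplication by a + b*7^(1/3) + c*49^(1/3)."""
    return a ** 3 + 49 * c ** 3 + 7 * b ** 3 - 21 * a * b * c

# Residues of cubes modulo 7.
cubes_mod_7 = sorted({(a ** 3) % 7 for a in range(7)})

# Sample triples; the range is an arbitrary choice for the check.
box = list(product(range(-5, 6), repeat=3))
congruent_to_cube = all((norm_form(a, b, c) - a ** 3) % 7 == 0 for a, b, c in box)
hits = [t for t in box if norm_form(*t) in (2, -2)]
print(cubes_mod_7, congruent_to_cube, hits)
```

The residues of $\pm 2$ modulo 7 are 2 and 5, neither of which is a cube residue.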

5. Since the element $\sqrt{-1} + \sqrt{-5}$ has degree 4 over $\mathbb Q$, the minimal polynomial
has degree 4. The product of $(X - (\sqrt{-1} + \sqrt{-5}))$ and the Galois transforms
$(X - (\sqrt{-1} - \sqrt{-5}))$, $(X - (-\sqrt{-1} + \sqrt{-5}))$, and $(X - (-\sqrt{-1} - \sqrt{-5}))$ is
$X^4 + 12X^2 + 16$, which is in $\mathbb Z[X]$.

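6. The minimal polynomial of $\xi = \tfrac12(\sqrt{-1} + \sqrt{-5})$ is $H(X) = X^4 + 2^{-2}\cdot 12X^2 +
2^{-4}\cdot 16 = X^4 + 3X^2 + 1$ with $|D(\xi)| = |N_{L/\mathbb Q}(H'(\xi))|$. Here $H'(X) = 4X^3 +
6X = 2X(2X^2 + 3)$. Since $\xi^4 + 3\xi^2 + 1 = 0$, we have $\xi^2 = \tfrac12(-3 \pm \sqrt5)$; thus
$2\xi^2 + 3 = \pm\sqrt5$. So $|D(\xi)| = |N_{L/\mathbb Q}(\pm 2\xi\sqrt5)|$. The four conjugates of $\sqrt5$ are
$+\sqrt5$ twice and $-\sqrt5$ twice, the norm is the product of the four conjugates, and
$N_{L/\mathbb Q}(\xi) = 1$ is the constant term of $H(X)$. Thus
$|D(\xi)| = |N_{L/\mathbb Q}(\pm 2\xi\sqrt5)| = 2^4 5^2$.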
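
As a floating-point sanity check of Problems 5 and 6 (a sketch only, with tolerances chosen for illustration), one can list the four conjugates of $\tfrac12(\sqrt{-1}+\sqrt{-5})$ explicitly, confirm that they are roots of $X^4 + 3X^2 + 1$, and multiply out $H'$ over them.

```python
import math

s5 = math.sqrt(5)
# The four conjugates of (sqrt(-1) + sqrt(-5))/2 are +-i(1 + sqrt5)/2 and +-i(1 - sqrt5)/2.
roots = [0.5j * (1 + s5), -0.5j * (1 + s5), 0.5j * (1 - s5), -0.5j * (1 - s5)]

def H(x):
    return x ** 4 + 3 * x ** 2 + 1

def dH(x):
    return 4 * x ** 3 + 6 * x

# Norm of H'(xi): product of H' over all conjugates.
norm = 1
for r in roots:
    norm *= dH(r)

# The undivided element sqrt(-1) + sqrt(-5) should be a root of X^4 + 12X^2 + 16.
alpha = 2 * roots[0]
residual = alpha ** 4 + 12 * alpha ** 2 + 16
print(norm, residual)
```

Up to rounding error the product is $400 = 2^4 5^2$.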

7. These follow immediately by applying Theorem 5.6 to the indicated prime, 2
or 5, and the respective polynomials: $X^2 + 5$, $X^2 + X - 1$, and $X^2 + 1$.

8. With $\mathbb Q \subseteq K' \subseteq L$, the $(e, f, g)$ for $L/\mathbb Q$ has to be entry by entry $\ge$ the triple for
$K'/\mathbb Q$. The triple for $K'/\mathbb Q$ is given in Problem 7b as $(1, 2, 1)$ for $p = 2$. Similarly
from $\mathbb Q \subseteq K'' \subseteq L$, the $(e, f, g)$ for $L/\mathbb Q$ has to be $\ge (2, 1, 1)$. Thus $e \ge 2$, $f \ge 2$,
and $g \ge 1$. Since $efg = 4$, equality must hold throughout: $(e, f, g) = (2, 2, 1)$.

This proves (a). Similarly for (b), we must have $(e, f, g) \ge (2, 1, 1)$ and $(e, f, g)
\ge (1, 1, 2)$. Thus $(e, f, g) \ge (2, 1, 2)$. Since $efg = 4$, $(e, f, g) = (2, 1, 2)$.

9. In (a), Problem 8a shows that $(2)T = P^2$, and we know that $(2)R = \mathfrak p_2^2$. Then
$P^2 = (2)T = (2)RT = \mathfrak p_2^2 T = (\mathfrak p_2 T)(\mathfrak p_2 T)$. Since $P$ is prime, $P$ divides $\mathfrak p_2 T$.
For the equality $P^2 = (\mathfrak p_2 T)^2$ to hold, we must have $P = \mathfrak p_2 T$.
Similarly $(5)T = P_1^2 P_2^2$ and $(5)R = \mathfrak p_5^2$. Then $P_1^2 P_2^2 = (5)T = (5)RT =
\mathfrak p_5^2 T = (\mathfrak p_5 T)^2$. Since $P_1$ and $P_2$ are prime, $P_1$ and $P_2$ must divide $\mathfrak p_5 T$. Therefore
$P_1 P_2 = \mathfrak p_5 T$.

In (b), conclusion (a) shows that no prime ideal of $R$ that divides $(2)R$ or $(5)R$
ramifies in $T$. Since $D(\xi)$ is divisible by no prime numbers other than 2 and 5,
Theorem 5.6 shows that no other prime ideal $(p)$ of $\mathbb Z$ ramifies in $T$. Hence no prime ideal
of $R$ containing such a prime $(p)$ of $\mathbb Z$ ramifies in $T$.


10. Roots of unity must map to roots of unity under the embedding, and there
are only two roots of unity within $\mathbb R$. Hence there are no real-valued embeddings
when $p > 2$. Thus the embeddings come in complex-conjugate pairs. The product
$\sigma(x)\overline{\sigma(x)}$ is positive for $x \ne 0$, and $N_{K/\mathbb Q}(x)$ is the product of these expressions over
all such pairs.


11. For (a), $F(X)$ is the minimal polynomial of $\zeta^k$ when $\mathrm{GCD}(k, p) = 1$. Then
$\zeta^k - 1$ is a root of $G(X) = F(X + 1)$ of the correct degree, and therefore $G(X)$ is
the minimal polynomial of $\zeta^k - 1$. If $H(X)$ is the field polynomial of an element $\eta$,
then $N_{K/\mathbb Q}(\eta) = (-1)^{[K:\mathbb Q]} H(0)$. In this instance $[K : \mathbb Q] = p - 1$ is even. Taking
$\eta = \zeta^k - 1$, we obtain $N_{K/\mathbb Q}(\zeta^k - 1) = G(0) = F(1) = p$.

For (b), $\zeta - 1$ divides $\zeta^k - 1$, and hence the quotient is in $R$. If $l$ is chosen with
$lk \equiv 1 \bmod p$, then $\zeta - 1 = \zeta^{lk} - 1$, and $\zeta^k - 1$ divides $\zeta^{lk} - 1$. Therefore the
reciprocal of $(\zeta^k - 1)/(\zeta - 1)$ is in $R$.

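12. With $F(X)$ and $G(X)$ as in the previous problem, $F'(\zeta^k) = G'(\zeta^k - 1)$.
Here $F(X) = (X^p - 1)/(X - 1)$ makes $G(X) = X^{-1}[(X + 1)^p - 1]$ and $G'(X) =
X^{-2}[pX(X + 1)^{p-1} - (X + 1)^p + 1]$. Since $\zeta^p = 1$,

$$F'(\zeta^k) = G'(\zeta^k - 1) = (\zeta^k - 1)^{-2}\big[p(\zeta^k - 1)\zeta^{k(p-1)} - \zeta^{kp} + 1\big] = (\zeta^k - 1)^{-1} p\, \zeta^{k(p-1)}.$$

The result now follows from the formula $\mathcal D(\zeta^k) = F'(\zeta^k)$.

13. Continuing from the previous problem gives

$$N_{K/\mathbb Q}(F'(\zeta^k)) = N_{K/\mathbb Q}(\zeta^k - 1)^{-1}\, p^{p-1}\, N_{K/\mathbb Q}(\zeta^{k(p-1)}) = p^{p-2}.$$

The result follows from the computation $(-1)^{(p-1)(p-2)/2} D(\zeta^k) = N_{K/\mathbb Q}(\mathcal D(\zeta^k)) =
N_{K/\mathbb Q}(F'(\zeta^k)) = p^{p-2}$.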
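
The sign and size of $D(\zeta)$ can be spot-checked numerically for small primes. This is only a floating-point sketch, under the assumption that $D(\zeta)$ means the product of the squared differences of the conjugates $\zeta^j$, $1 \le j \le p - 1$; the function names are chosen here for illustration.

```python
import cmath

def cyclotomic_discriminant(p):
    """Product over i < j of (zeta^i - zeta^j)^2 for the primitive p-th roots of unity."""
    roots = [cmath.exp(2j * cmath.pi * k / p) for k in range(1, p)]
    value = 1
    for i in range(len(roots)):
        for j in range(i + 1, len(roots)):
            value *= (roots[i] - roots[j]) ** 2
    return round(value.real)

def predicted(p):
    """The value (-1)^((p-1)(p-2)/2) * p^(p-2) obtained in Problem 13."""
    return (-1) ** ((p - 1) * (p - 2) // 2) * p ** (p - 2)

for p in (3, 5, 7):
    print(p, cyclotomic_discriminant(p), predicted(p))
```

For these primes the two columns agree.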

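14. For (a), we have $\lambda^k = (1 - \zeta)^k = \sum_{j=0}^{k} (-1)^j \binom{k}{j} \zeta^j$ and $\zeta^k = (1 - \lambda)^k =
\sum_{j=0}^{k} (-1)^j \binom{k}{j} \lambda^j$. Conclusion (b) is a version of Problem 11b because the conjugates
of $\zeta$ are the powers $\zeta^j$ for $1 \le j \le p - 1$. For (c), we have $p = \prod_{k=1}^{p-1} (1 - \zeta^k) =
\prod_{k=1}^{p-1} (1 - \zeta)u_k = (1 - \zeta)^{p-1} \prod_{k=1}^{p-1} u_k$, where $u_k = (1 - \zeta^k)/(1 - \zeta)$. Each element
$u_k$ is a unit by Problem 11c, and (c) follows.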


15. The identity $(p)R = (1 - \zeta)^{p-1}R$ is immediate from Problem 14c. The
extension $K/\mathbb Q$ being Galois, we know that the prime decomposition of the ideal
$(p)R$ is of the form $(p)R = P_1^e \cdots P_g^e$, where $p - 1 = efg$ and $f$ is the common
value of all $\dim_{\mathbb F_p}(R/P_j)$. This latter fact says that no factorization of $(p)R$ into
proper ideals can have more than $p - 1$ factors, and $p - 1$ factors occur only if all
factors are prime. In this case, $(1 - \zeta)$ is a proper ideal because $N_{K/\mathbb Q}(1 - \zeta) = p$.
Thus each factor $(1 - \zeta)$ is prime.


16. Following Proposition 5.2, suppose that $a_j$ is an integer for each $j$ with
$s \le j \le k$ such that $0 \le a_j \le p - 1$, $a_s \ne 0$, $a_k = 1$, and

$$a_s\lambda^s + a_{s+1}\lambda^{s+1} + a_{s+2}\lambda^{s+2} + \cdots + a_{k-1}\lambda^{k-1} + a_k\lambda^k = pr$$

with $r$ in $R$. Subtracting all terms from the left side but the first and applying
Problem 15 shows that $a_s\lambda^s$ lies in $(\lambda)^{s+1}$. Thus $(a_s)(\lambda)^s \subseteq (\lambda)^{s+1}$. Canceling gives
$(a_s) \subseteq (\lambda)$, and this inclusion is a contradiction because $\mathrm{GCD}(N((a_s)), N((\lambda))) = 1$.


17. Each step toward a $\mathbb Z$ basis multiplies a discriminant by a square, and it is
enough to prove that a primitive element $\xi$ for $K/\mathbb Q$ lying in $R$ has $\operatorname{sgn} D(\xi) = (-1)^{r_2}$.
We are thus to compute the sign of $\prod_{i<j} (\sigma_i(\xi) - \sigma_j(\xi))^2$. For a given pair $(i, j)$, the
factor $(\sigma_i(\xi) - \sigma_j(\xi))^2$ is matched by its complex conjugate elsewhere in the product
unless $\sigma_i$ and $\sigma_j$ are both real or are complex conjugates of one another. The factor
and its mate have a positive product, and a pair with $\sigma_i$ and $\sigma_j$ both real contributes a
positive square. If $\sigma_i = \overline{\sigma_j}$, then $\sigma_i(\xi) - \sigma_j(\xi)$ is purely imaginary, and its square is
negative. Hence the sign is $(-1)^{r_2}$.

18. Let $g$ be in $\operatorname{Gal}(K/\mathbb Q) = \{\sigma_1, \dots, \sigma_n\}$. Replacing each $\sigma_j$ by $g\sigma_j$ has the
effect of permuting the columns of $[\sigma_j(\alpha_i)]$. If the permutation is even, then the terms
contributing to $P$ are the same before and after the permutation; otherwise they are
interchanged. In either case, $P + N$ and $PN$ are fixed. Since $P + N$ and $PN$ are
fixed by the Galois group, they are in $\mathbb Q$. The entries $\sigma_j(\alpha_i)$ of the matrix are in $R$,
and thus $P$ and $N$ are in $R$. Consequently $P + N$ and $PN$ are in $\mathbb Z$. The formula
$D(\Gamma) = (P + N)^2 - 4PN$ shows that $D(\Gamma) \equiv (P + N)^2 \bmod 4$. Any square of a
member of $\mathbb Z$ is congruent to 0 or 1 modulo 4, and the result follows.
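
As a quick consistency check, the field discriminants that occur in the solutions of this chapter ($-23$, $-31$, $-44$, $-243$, $-503$) can be reduced modulo 4. The list below is assembled by hand from those solutions and is only an illustration.

```python
# Field discriminants quoted in the solutions of Problems 1, 2, 3, and 25.
discriminants = [-23, -31, -44, -243, -503]

# Python's % returns a residue in {0, 1, 2, 3} even for negative integers.
residues = {d: d % 4 for d in discriminants}
print(residues)
```

Each residue is 0 or 1, as the congruence in Problem 18 requires.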

19. Let $J$ be an ideal of $S^{-1}R$. Proposition 8.47 of Basic Algebra shows that
$I = R \cap J$ is an ideal in $R$ and that $J = S^{-1}I$. Since $I_1, \dots, I_h$ is a complete set
of representatives for the equivalence classes, $aI = bI_j$ for some $j$ with $1 \le j \le h$.
Let $(a)_S$ and $(b)_S$ be the principal ideals of $S^{-1}R$ generated by $a$ and $b$. The fact that
$u$ is in $I_j \cap S$ means that $S^{-1}I_j = S^{-1}R$, and thus

$$(a)_S J = S^{-1}(a)\,S^{-1}I = S^{-1}(aI) = S^{-1}(bI_j) = S^{-1}(b)\,S^{-1}I_j = S^{-1}(b)\,S^{-1}R = (b)_S. \qquad (*)$$

Hence $J$ is principal. (In fact, the equality shows that $aj = b$ for some $j \in J$.
Hence $ba^{-1} = j$ is an element of $J \subseteq S^{-1}R$, the principal ideal $(ba^{-1})_S$ of $S^{-1}R$ is
meaningful, and $(ba^{-1})_S \subseteq J$. For the reverse inclusion let $j \in J$ be given, and use
$(*)$ to write $aj = bx$ with $x \in S^{-1}R$. Then $j = (ba^{-1})x$ shows that $j$ is in $(ba^{-1})_S$,
and $J \subseteq (ba^{-1})_S$.)




20. For (a), write $ab = u^k$. Then $a^{-1} = u^{-k}b$ exhibits $a^{-1}$ as in $S^{-1}R$. For (b),
if $u^{-m}a$ is a unit in $S^{-1}R$, then $u^{-m}a^{-1} = u^{-l}c$ for some $c \in R$. Hence $ac = u^{l-m}$.
Since $ac$ is in $R$ and $u^{-1}$ is not, $l - m = k$ with $k \ge 0$. Then $a$ divides $u^k$.


21. For (a), write $(u) = P_1^{e_1} \cdots P_l^{e_l}$. Then $(u^h) = (P_1^h)^{e_1} \cdots (P_l^h)^{e_l} = (b_1^{e_1} \cdots b_l^{e_l})$.
Thus $u^h = b_1^{e_1} \cdots b_l^{e_l}\varepsilon$ for some unit $\varepsilon$ in $R$, each $b_j$ divides $u^h$, and the conclusion
follows from Problem 20a.

For (b), we have $(a)(b) = (u)^k = P_1^{ke_1} \cdots P_l^{ke_l}$. Since $a$ and $b$ are in $R$, this
equality implies that $(a) = P_1^{r_1} \cdots P_l^{r_l}$. For each $j$, use the division algorithm to
write $r_j = n_j h + t_j$ with $0 \le t_j < h$. Then $P_j^{r_j} = (P_j^h)^{n_j} P_j^{t_j} = (b_j)^{n_j} P_j^{t_j}$, and
consequently $(a) = (d) P_1^{t_1} \cdots P_l^{t_l}$ as required, where $d = \prod_j b_j^{n_j}$.

The argument for (c) was given in parentheses at the end of the solution of
Problem 19.


22. Because of Problem 21d, we now have $(a) = (d)(c_j)$. Thus $a = dc_j\varepsilon$ for
some unit $\varepsilon$ in $R$. Since $u^k = ab = c_j db\varepsilon$, $c_j$ divides $u^k$ and is a unit in $S^{-1}R$ by
Problem 20a.


23. Problem 22 shows that any unit of $S^{-1}R$ is a product of a power of $u$ by a
product $\prod_j b_j^{n_j}$, an element $c_j$, and a unit $\varepsilon$ of $R$. Problem 21a shows that each $b_j$ is
a unit in $S^{-1}R$, and Problem 22 shows that each $c_j$ is a unit in $S^{-1}R$. Thus $(S^{-1}R)^\times$
is generated by $u$, the finitely many elements $b_j$ and $c_j$, and a finite set of generators
of $R^\times$. (The group $R^\times$ is finitely generated by the Dirichlet Unit Theorem.)


24. $G(4/\xi) = 64\xi^{-3} - 16\xi^{-2} + 8\xi^{-1} + 8 = 8\xi^{-3}(\xi^3 + \xi^2 - 2\xi + 8) =
8\xi^{-3}F(\xi) = 0$. The element $\eta$ is in $K$, and it is exhibited as the root of a monic
polynomial in $\mathbb Z[X]$; therefore it is in $R$.


25. For (a), $0 = F(\xi)/\xi = \xi^2 + \xi - 2 + 8\xi^{-1} = \xi^2 + \xi - 2 + 2\eta$. For (b),
$0 = G(\eta)/\eta = \eta^2 - \eta + 2 + 8/\eta = \eta^2 - \eta + 2 + 2\xi$. Solving the first equation for $\xi^2$
gives the first formula in the table, and solving the second equation for $\eta^2$ gives the
second formula in the table. The formula $\xi\eta = 4$ is immediate from the definition
$\eta = 4/\xi$. The formulas in the table together show that any integer polynomial in $\xi$
and $\eta$ reduces to a $\mathbb Z$ combination of $1$, $\xi$, and $\eta$.

Conclusion (c) is clear. For (d), we have $\eta = 1 - \tfrac12(\xi^2 + \xi)$, and this is not in
$\mathbb Z(\{1, \xi, \xi^2\})$. For (e), we have $D((1, \xi, \xi^2)) = -2^2 \cdot 503$. Since the only square factor
is $2^2$, it follows that $\mathbb Z(\{1, \xi, \xi^2\})$ has index 2 in $\mathbb Z(\{1, \xi, \eta\})$ and that $D((1, \xi, \eta)) =
-503$. This latter discriminant is square free and thus cannot be reduced further.
Therefore $D_K = -503$, and $\{1, \xi, \eta\}$ is a $\mathbb Z$ basis of $R$. Finally the formula $\eta =
1 - \tfrac12(\xi^2 + \xi)$ shows that $\mathbb Z(\{1, \xi, \eta\}) = \mathbb Z(\{1, \xi, \tfrac12(\xi^2 + \xi)\})$.
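
The value $D((1, \xi, \xi^2)) = -2^2 \cdot 503$ in (e) can be recomputed from the coefficients of $F(X) = X^3 + X^2 - 2X + 8$. The sketch below uses the standard closed formula for the discriminant of a monic cubic $X^3 + bX^2 + cX + d$; that formula is supplied here as an assumption and is not quoted from the text.

```python
def cubic_discriminant(b, c, d):
    """Discriminant of the monic cubic X^3 + b*X^2 + c*X + d."""
    return b * b * c * c - 4 * c ** 3 - 4 * b ** 3 * d - 27 * d * d + 18 * b * c * d

disc_F = cubic_discriminant(1, -2, 8)     # F(X) = X^3 + X^2 - 2X + 8
disc_cube_root_3 = cubic_discriminant(0, 0, -3)   # X^3 - 3, as in Problem 3
print(disc_F, disc_cube_root_3)
```

Dividing the first value by the square of the index 2 gives $-503$, the discriminant found for the basis $\{1, \xi, \eta\}$.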


26. Application of $\varphi$ to $\xi^2 = -\xi + 2 - 2\eta$ gives $\varphi(\xi)^2 = \varphi(\xi)$. Similarly $\varphi(\eta)^2 = \varphi(\eta)$. The
elements of a finite field of characteristic 2 fixed by the squaring map are 0 and 1.
Hence $\varphi(\xi)$ and $\varphi(\eta)$ are in $\{0, 1\}$. Since $F = \varphi(R)$ is generated by the values of $\varphi$ on $1$,
$\xi$, and $\eta$, $F$ has two elements. From $\xi\eta = 4$, it follows that $\varphi(\xi)\varphi(\eta) = 0$. Thus $\varphi(\xi)$ and $\varphi(\eta)$
cannot both be 1, and the only possibilities are the ones in the table.
27. Define $\varphi : R \to \mathbb F_2$ on $\xi$ and $\eta$ by one of the lines of the table of Problem 26,
and set $\varphi(1) = 1$. Then $\varphi$ extends to a well-defined additive homomorphism on
$\mathbb Z(\{1, \xi, \eta\})$. We have to check that $\varphi$ respects multiplication. It is enough to do so
on additive generators. Thus we have to check that $\varphi(\xi^2) = (\varphi(\xi))^2$, that $\varphi(\eta^2) =
(\varphi(\eta))^2$, and that $\varphi(\xi\eta) = (\varphi(\xi))(\varphi(\eta))$. Thus, for example, in the first one we
want $-\varphi(\xi) + 2\varphi(1) - 2\varphi(\eta) = (\varphi(\xi))^2$. If we write the values of $\varphi$ as triples
corresponding to the three possible $\varphi$'s, the left side is $-(0, 1, 0) + 2(1, 1, 1) -
2(0, 0, 1) \equiv (0, 1, 0) \bmod 2$, while the right side is $(0, 1, 0)^2 \equiv (0, 1, 0) \bmod 2$.
These match, and this relation is verified. The other two relations are verified in
similar fashion.
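
All three multiplicative relations can be checked at once. In the sketch below each candidate $\varphi$ is recorded by the pair $(\varphi(\xi), \varphi(\eta))$, the relations $\xi^2 = -\xi + 2 - 2\eta$, $\eta^2 = \eta - 2 - 2\xi$, and $\xi\eta = 4$ are those of Problem 25, and the function name is chosen here for illustration.

```python
# Rows of the table, written as (phi(xi), phi(eta)), with phi(1) = 1.
table = [(0, 0), (1, 0), (0, 1)]

def respects_relations(x, e):
    """Check the three multiplication relations modulo 2 for phi(xi) = x, phi(eta) = e."""
    first = (-x + 2 - 2 * e - x * x) % 2 == 0    # xi^2 = -xi + 2 - 2*eta
    second = (e - 2 - 2 * x - e * e) % 2 == 0    # eta^2 = eta - 2 - 2*xi
    third = (4 - x * e) % 2 == 0                 # xi*eta = 4
    return first and second and third

results = [respects_relations(x, e) for x, e in table]
print(results, respects_relations(1, 1))
```

The three table rows pass, while the excluded pair $(1, 1)$ fails the relation coming from $\xi\eta = 4$.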


28. The norm of a kernel equals the number of elements in the image of the 
homomorphism, which is 2 in each case. Since each ideal has prime norm, the 
ideal is prime. Moreover, these ideals contain (2)R and hence all figure into the 
prime factorization of $(2)R$. On the other hand, we must have $\sum_j e_j f_j = 3$ for
the decomposition, and we have seen that there are at least three terms. So there
are exactly three terms, and we must have $e_j = f_j = 1$ in each case. Therefore
$(2)R = P_{0,0}P_{1,0}P_{0,1}$.

29. For (a), the elements listed are additive generators of the ideal in each case,
and hence they are also ideal generators. For (b), $\eta = \eta(\xi + 1) - 2 \cdot 2$ shows that
$\eta$ is in the ideal $(2, \xi + 1)$. Thus $(2, \xi + 1, \eta) \subseteq (2, \xi + 1)$. The reverse inclusion
is clear. In (c), the argument for $(2, \eta + 1)$ is completely symmetric. Let us see that
$(2, \xi, \eta) = (2, \xi - \eta)$. The inclusion $\supseteq$ is clear. For the inclusion $\subseteq$, we use the two
formulas

$$(-1 - \eta)2 + (-\xi)(\xi - \eta) = -2 - 2\eta + (\xi + 2\eta + 2) = \xi,$$
$$(3 + \xi)2 + (-\eta)(\xi - \eta) = 6 + 2\xi - 4 + (-2\xi - 2 + \eta) = \eta.$$


30. For (a), the field polynomial of $\theta - q$ is $H(X + q)$, and so the norm of $\theta - q$ is
$-H(0 + q)$, as required. In (b), the first two formulas come from the field polynomials
$F(X)$ and $G(X)$ of $\xi$ and $\eta$, and the other formulas follow from (a).
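
The norm formula of (a) makes the norms used in Problems 30 through 32 a matter of evaluating $F(X) = X^3 + X^2 - 2X + 8$. The sketch below applies $N(\xi - q) = -F(q)$ to the elements $\xi$, $\xi + 3$, $\xi - 1$, $\xi + 2$, and $\xi + 1$ that appear in those solutions; the function name is chosen here for illustration.

```python
def F(x):
    """Field polynomial of xi."""
    return x ** 3 + x ** 2 - 2 * x + 8

def norm_xi_minus(q):
    """Norm of xi - q, using the formula from part (a) with H = F."""
    return -F(q)

norms = {
    "xi": norm_xi_minus(0),
    "xi + 3": norm_xi_minus(-3),
    "xi - 1": norm_xi_minus(1),
    "xi + 2": norm_xi_minus(-2),
    "xi + 1": norm_xi_minus(-1),
}
print(norms)
```

The absolute values are 8, 4, 8, 8, and 10.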

In (c), the fact that $N((\xi)) = |N_{K/\mathbb Q}(\xi)| = 8$ shows that the prime factorization
of $(\xi)$ is into prime ideals whose norms are powers of two. Problem 28 shows that
all such ideals have been identified, and thus $(\xi) = P_{0,0}^a P_{1,0}^b P_{0,1}^c$ for some exponents
$\ge 0$. Comparing norms shows that $a + b + c = 3$. Similar remarks apply to $(\eta)$.

In (d), use of Problem 28 shows that $P_{0,0}^2 P_{1,0}^2 P_{0,1}^2 = ((2)R)^2 = 4R = (\xi)(\eta) =
P_{0,0}^{a+\alpha} P_{1,0}^{b+\beta} P_{0,1}^{c+\gamma}$. Then $a + \alpha = 2$, $b + \beta = 2$, and $c + \gamma = 2$ by unique factorization.

For (e), we observe from the kernels, or else we see from Problem 29a, that $\xi$ is not
in $P_{1,0}$ and that $\eta$ is not in $P_{0,1}$. Hence $P_{1,0}$ does not appear in the prime factorization of
$(\xi)$, and $P_{0,1}$ does not appear in the prime factorization of $(\eta)$. Therefore $b = \gamma = 0$.

For (f), the results of (e) and (d) combine to show that $a + \alpha = 2$, $\beta = 2$, and
$c = 2$. Since $a + c = 3$ and $\alpha + \beta = 3$, $a = \alpha = 1$.

31. For (a), we see immediately from Problem 29a that $\xi + l$ lies in $P_{1,0}$ but
not in $P_{0,0}$ and not in $P_{0,1}$. For (b), the formula $|N_{K/\mathbb Q}(\xi + 3)| = 2^2$ shows that
$(\xi + 3)$ is the product of exactly two of the prime ideals of norm 2; thus (a) implies
that $(\xi + 3) = P_{1,0}^2$. Similarly $|N_{K/\mathbb Q}(\xi - 1)| = 2^3$, and (a) gives $(\xi - 1) = P_{1,0}^3$.
Conclusion (c) is immediate from Problem 29a.

For (d), we have $(2)R \subseteq (2, \xi)$; thus $(2, \xi)$ is of the form $P_{0,0}^a P_{1,0}^b P_{0,1}^c$ with
$a + b + c \le 3$. Since $\xi$ is not in $P_{1,0}$, $b = 0$. Since $\xi$ is in $P_{0,0}$ and $P_{0,1}$, we must
have $a > 0$ and $c > 0$. Since the inclusion $(2)R \subseteq (2, \xi)$ is proper (because $\xi$ is not
in $(2)R = 2\mathbb Z(\{1, \xi, \eta\})$), $N((2, \xi)) \le 4$. Thus $a = c = 1$, and $(2, \xi) = P_{0,0}P_{0,1}$.

For (e), Problem 29a shows that $P_{0,1} = (2, \xi, \eta + 1)$. Thus $P_{0,1}^2$ contains 4 and
$\xi(\eta + 1) = 4 + \xi$, hence $\xi$. If $P_{0,1}^2$ contains also $\xi + l$ with $l \equiv 2 \bmod 4$, then it
contains $\xi + 2$, hence 2. This would mean that $P_{0,1}^2 \supseteq (2, \xi) = P_{0,0}P_{0,1}$. Since $P_{0,1}^2$
and $P_{0,0}P_{0,1}$ both have norm 4, they would have to be equal, and we would obtain
$P_{0,1} = P_{0,0}$, contradiction.

For (f), Problem 30b gives $N((\xi + 2)) = 8$. In view of (c), $(\xi + 2) = P_{0,0}^a P_{0,1}^c$
with $a + c = 3$ and $c \ge 1$. Part (d) shows that $c \le 1$. Thus $(\xi + 2) = P_{0,0}^2 P_{0,1}$. The
argument for $(\xi - 2)$ is similar.


32. For (a), this kind of argument is done in a parenthetical remark at the end of
the solution of Problem 19. For (b), we have $(\xi + 2) = P_{0,0}^2 P_{0,1}$ and $(\xi - 1) = P_{1,0}^3 =
(\xi + 3)P_{1,0}$. Thus the same kind of argument shows that $P_{0,1}$ and $P_{1,0}$ are principal.

For (c), we factor $X^3 + X^2 - 2X + 8$ modulo 3; there is no root in $\mathbb F_3$, and hence
the reduced polynomial is irreducible. By Theorem 5.6 the only prime ideal whose
norm is a power of 3 has norm $3^3$.

For (d), we factor $X^3 + X^2 - 2X + 8$ modulo 5 as $(X + 1)(X^2 - 2)$, and Theorem
5.6 gives us one prime ideal of norm 5 and one of norm $5^2$. The one of norm 5,
according to the theorem, is $(5, 1 + \xi)$. For (e), the technique of Problem 30a shows
that $N((1 + \xi)) = 10$. Thus the only possibility for the prime factorization of $(1 + \xi)$
is as $(5, 1 + \xi)P$, where $P$ is one of the three ideals of norm 2. For (f), since $(1 + \xi)$
and $P$ are principal, $(5, 1 + \xi)$ is principal, by the same technique as in earlier parts.
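
The factorizations modulo 3 and modulo 5 used in (c) and (d) can be confirmed by direct search. The sketch below lists the roots of $F(X) = X^3 + X^2 - 2X + 8$ in $\mathbb F_3$ and $\mathbb F_5$ and verifies the stated factorization modulo 5 at every residue; the function names are chosen here for illustration.

```python
def F(x):
    return x ** 3 + x ** 2 - 2 * x + 8

def roots_mod(p):
    """Residues x in {0, ..., p-1} with F(x) divisible by p."""
    return [x for x in range(p) if F(x) % p == 0]

roots_3 = roots_mod(3)
roots_5 = roots_mod(5)

# F(X) and (X + 1)(X^2 - 2) agree as functions modulo 5 (they differ by the constant 10).
agrees_mod_5 = all((F(x) - (x + 1) * (x * x - 2)) % 5 == 0 for x in range(5))

# The quadratic factor X^2 - 2 has no root modulo 5, so it is irreducible there.
quadratic_roots_5 = [x for x in range(5) if (x * x - 2) % 5 == 0]
print(roots_3, roots_5, agrees_mod_5, quadratic_roots_5)
```

A cubic with no root in $\mathbb F_3$ is irreducible there, and modulo 5 the single root 4 corresponds to the factor $X + 1$.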

For (g), the prime factorization of nonzero ideals allows us to conclude that every
nonzero ideal of norm $\le 6$ is principal. Application of the technique after Theorem
5.21 shows that every ideal class has a representative with norm $\le 6.35$, hence norm
$\le 6$. All such ideals are principal, and therefore $R$ is a principal ideal domain.


Chapter VI 


1. Apply the Cauchy criterion. Since $|a_n + a_{n+1} + \cdots + a_m|_p \le \max_{n \le k \le m} |a_k|_p$,
the series is Cauchy, hence convergent, if and only if the terms tend to 0.

2. In (a), the equality $\mathrm{GCD}(3, 2^n) = 1$ implies that there exist integers $x_n$ and $y_n$
such that $3x_n - 2^n y_n = 1$. Then $x_n - \tfrac13 = 2^n 3^{-1} y_n$. Applying the 2-adic absolute
value gives $|x_n - \tfrac13|_2 = 2^{-n}|y_n|_2 \le 2^{-n}$, and this tends to 0. For example take
$x_n = \tfrac13(2^{2n-1} + 1)$. In (b), the argument with $\tfrac ab$ replacing $\tfrac13$ is similar: to get
$|x - \tfrac ab|_2 \le 2^{-n}$, start by finding $x$ and $y$ with $bx - 2^n y = a$.
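
The construction in (a) and (b) can be carried out with modular inverses. This is a minimal sketch assuming Python 3.8 or later, where `pow(b, -1, m)` returns the inverse of `b` modulo `m`; the names `v2` and `approximate` are chosen here for illustration.

```python
import math

def v2(n):
    """2-adic valuation of an integer, with v2(0) taken to be infinity."""
    if n == 0:
        return math.inf
    n = abs(n)
    count = 0
    while n % 2 == 0:
        n //= 2
        count += 1
    return count

def approximate(a, b, n):
    """An integer x with b*x congruent to a modulo 2^n; b must be odd."""
    modulus = 2 ** n
    return (a * pow(b, -1, modulus)) % modulus

# |x_n - 1/3|_2 <= 2^(-n) is equivalent to v2(3*x_n - 1) >= n.
valuations = [v2(3 * approximate(1, 3, n) - 1) for n in range(1, 11)]
print(valuations)
```

Each valuation is at least the corresponding $n$, so the integers produced converge 2-adically to $\tfrac13$.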

3. Write ideles as tuples indexed by $\infty, 2, 3, 5, \dots$. If $q$ is in $\mathbb Q^\times$, then $\iota(q) =
(q, q, q, q, \dots)$. If this is to be in $\mathbb R^\times \times \prod_p \mathbb Z_p^\times$, then the only restriction on the first
coordinate is that $q \ne 0$, but the other coordinates are restricted by $|q|_p = 1$ for all
primes $p$. This means that $q$ in lowest terms has no $p$ in either the numerator or the
denominator. So $q = \pm 1$. This proves (a).

In (b), let $(x_\infty, x_2, x_3, \dots)$ be in $\mathbb I$. Since $|x_p|_p \ne 1$ for only finitely many $p$, there
exists a unique positive rational $q$ such that $|q|_p = |x_p|_p$ for all $p$. Define $z_p = x_p q^{-1}$
as a member of $\mathbb Q_p^\times$. Then $|z_p|_p = |x_p|_p |q|_p^{-1} = 1$ shows that $|z_p|_p = 1$ for all $p$.
Finally define $r = x_\infty q^{-1}$ as a member of $\mathbb R^\times$. Then $(r, z_2, z_3, \dots)$ is in $\mathbb I(S_\infty)$, and
$(x_\infty, x_2, x_3, \dots) = (q, q, q, \dots)(r, z_2, z_3, \dots)$.

4. In (a), the norm of the ideal divides the norm of any element, and if the
norm of the ideal is prime, then the ideal is prime. With $K = \mathbb Q(\sqrt{-5})$, we have
$N_{K/\mathbb Q}(1 + \sqrt{-5}) = 6$, $N_{K/\mathbb Q}(3) = 9$, and $N_{K/\mathbb Q}(2) = 4$. Therefore $N((1 + \sqrt{-5}, 3))$
divides $\mathrm{GCD}(6, 9) = 3$, and $N((1 + \sqrt{-5}, 2))$ divides $\mathrm{GCD}(6, 4) = 2$. One checks
that these ideals are not all of $R$, and then the respective norms are 3 and 2. So
the ideals are prime. In (b), $(1 + \sqrt{-5}) = (1 + \sqrt{-5}, 2)(1 + \sqrt{-5}, 3)$, and $(3) =
(1 + \sqrt{-5}, 3)(1 - \sqrt{-5}, 3)$.

In (c), $\tfrac13(1 + \sqrt{-5})R = (1 + \sqrt{-5}, 2)(1 + \sqrt{-5}, 3)(1 + \sqrt{-5}, 3)^{-1}(1 - \sqrt{-5}, 3)^{-1}
= (1 + \sqrt{-5}, 2)(1 - \sqrt{-5}, 3)^{-1}$, and $(1 + \sqrt{-5}, 3)$ does not appear.

In(d), lev=5 = 25) = 2047-5) 

@py-sid-=V=5) 1 Te 


5. The mapping $\varphi : 1 + P^n \to P^n/P^{n+1}$ induced by $1 + x \mapsto x + P^{n+1}$ is
a homomorphism from $1 + P^n$ under multiplication into $P^n/P^{n+1}$ under addition
because the equalities $\varphi(1 + x) = x + P^{n+1}$, $\varphi(1 + y) = y + P^{n+1}$, and

$$\varphi((1 + x)(1 + y)) = \varphi(1 + x + y + xy) = x + y + xy + P^{n+1} = x + y + P^{n+1}$$

show that $\varphi((1 + x)(1 + y)) = \varphi(1 + x) + \varphi(1 + y)$. The kernel of $\varphi$ is the set of
all $1 + x$ with $x \in P^{n+1}$, i.e., $1 + P^{n+1}$, and the image is certainly all of $P^n/P^{n+1}$.

6. The composition $\mathbb I^1/\iota(K^\times) \to \mathbb I/\iota(K^\times) \to \mathcal I/\mathcal P$ induced by the inclusion
$\mathbb I^1 \to \mathbb I$ and the passage from $\mathbb I$ to $\mathcal I$ discussed in Section 10 is onto $\mathcal I/\mathcal P$ because the
composition is affected by only the nonarchimedean places and because any member
of $\mathbb I$ can be adjusted at the archimedean places so as to be in $\mathbb I^1$. In addition, the
composition is continuous if $\mathcal I/\mathcal P$ is given the discrete topology. Since $\mathbb I^1/\iota(K^\times)$ is
compact, the discrete space $\mathcal I/\mathcal P$ has to be compact and must be finite.


7. Fix a finite subset $S$ of places containing $S_\infty$. Then the projection of $\prod_{w \in S} K_w^\times$
to $K_v^\times$ is continuous for each $v \in S$. Since also the inclusion $K_v^\times \to K_v$ is continuous,
the composition $\prod_{w \in S} K_w^\times \to K_v$ is continuous. Thus the corresponding mapping
$\prod_{w \in S} K_w^\times \to \prod_{v \in S} K_v$ is continuous. In similar fashion $\prod_{w \notin S} R_w^\times \to R_v$ is a
continuous function as a composition of continuous functions. Thus $\prod_{w \notin S} R_w^\times \to
\prod_{v \notin S} R_v$ is continuous. Putting these two compositions together shows that $\mathbb I_K(S) \to
\mathbb A_K(S)$ is continuous, and therefore $\mathbb I_K(S) \to \mathbb A_K$ is continuous. Since this is true for
each $S$, it follows that $\mathbb I_K \to \mathbb A_K$ is continuous.


8. Each $x_n$ lies in $\mathbb A_{\mathbb Q}(S_\infty)$, which is an open set in $\mathbb A_{\mathbb Q}$. For each prime $p$,
$x_{n,p} = 1$ if $n$ is large enough, and also $x_{n,\infty} = 1$ for all $n$. Since $\mathbb A_{\mathbb Q}(S_\infty)$ has the
product topology, $\{x_n\}$ converges to $(1)$. On the other hand, if $\{x_n\}$ were to converge
to some limit $x$ in $\mathbb I_{\mathbb Q}$, then $x$ would have to lie in some $\mathbb I(S)$, and the ideles $x_n$ would
have to be in $\mathbb I(S)$ for large $n$. But $(x_{n,v})$ is not in $\mathbb I(S)$ as soon as $v$ is outside $S$.


9. For fixed $g$ in $G$, we have $d(\Phi(gx)) = d(\Phi(g)\Phi(x)) = d(\Phi(x))$, and hence
$d(\Phi(\,\cdot\,))$ and $d(\,\cdot\,)$ are Haar measures on $G$. Any two Haar measures are proportional,
and the result follows.


10. In (a) the equality is trivial if $c_1c_2 = 0$. When $c_1c_2 \ne 0$, we have $d(c_1c_2x) =
|c_1c_2|_F\,dx$ and also $d(c_1c_2x) = |c_1|_F\,d(c_2x) = |c_1|_F|c_2|_F\,dx$, and it follows that
$|c_1c_2|_F = |c_1|_F|c_2|_F$ in this case as well.

The proof of continuity is harder (but is essential to make sense out of (b)). We first
check continuity at each $c_0 \ne 0$. Let $f$ be a continuous real-valued function vanishing
off a compact set $S$, and let $N$ be a compact neighborhood of $c_0$ not containing 0. If $c$
is in $N$, then $f(c^{-1}x)$ is nonzero only for $x$ in the compact set $NS$. Let $\epsilon > 0$ be given.
Continuity of $(c, x) \mapsto f(c^{-1}x)$ allows us to find, for each $x$ in $NS$, an open
subneighborhood $N_x$ of $c_0$ and an open neighborhood $U_x$ of $x$ such that $|f(c^{-1}y) - f(c_0^{-1}x)| <
\epsilon$ for $c \in N_x$ and $y \in U_x$. Then $|f(c^{-1}y) - f(c_0^{-1}y)| < 2\epsilon$ for $c \in N_x$ and $y \in U_x$.
The open sets $U_x$ cover $NS$. Forming a finite subcover and intersecting the
corresponding finitely many sets $N_x$, we obtain an open neighborhood $N'$ of $c_0$ such
that $|f(c^{-1}y) - f(c_0^{-1}y)| < 2\epsilon$ for $c \in N'$ whenever $y$ is in $NS$. As a result,
$c \mapsto \int_F f(c^{-1}x)\,dx$ is continuous at $c = c_0$. Therefore $c \mapsto |c|_F \int_F f(x)\,dx$ is
continuous at $c_0$, and so is $c \mapsto |c|_F$.

To prove continuity at $c = 0$, we are to show that $\lim_{c \to 0} \int_F f(c^{-1}x)\,dx = 0$ for
$f$ as above. Let $U$ be any compact neighborhood of 0 in $F$. Find a sufficiently small
neighborhood $N$ of 0 in $F$ such that $c \in N$ implies that $cS$ does not meet $U^c$. Then
$c^{-1}U^c \cap S = \varnothing$. For such $c$'s, we have $\big|\int_F f(c^{-1}x)\,dx\big| = \big|\int_U f(c^{-1}x)\,dx\big| \le
\|f\|_{\sup}(dx(U))$, and the desired limit relation follows.

For (b), we have $d(cx)/|cx|_F = (|c|_F\,dx)/(|c|_F|x|_F) = dx/|x|_F$. For (c), $|x|_F =
|x|$ if $F = \mathbb R$, and $|x|_F = |x|^2$ if $F = \mathbb C$. For (d), $|x|_F = |x|_p$ if $F = \mathbb Q_p$. For (e),
we have $I = p\mathbb Z_p$, and therefore the Haar measure of $I$ is the product of $|p|_p = p^{-1}$
times the Haar measure of $\mathbb Z_p$. Hence the Haar measure of $I$ is $p^{-1}$.

11. If $F$ has characteristic $p' \ne 0$, then the sum $1 + \cdots + 1$ with $p'$ terms is 0
in $R$, and it must be 0 in $R/\mathfrak p$. So $R/\mathfrak p$ must have characteristic $p'$. Thus any such
$p' \ne 0$ must be $p$.

12. In (a), apply Corollary 6.29 with $f(X) = X^{q-1} - 1$ in $R[X]$. Every nonzero
$\bar{a}$ is a simple root of the reduced polynomial $\bar{f}(X) = X^{q-1} - 1$ in $\mathbb{F}_q[X]$, simple
because $(q - 1)(\bar{a})^{q-2} \neq 0$. The corollary produces a root $a$ of $f(X)$ whose image
in $R/\mathfrak{p}$ is $\bar{a}$. In this way we obtain $q - 1$ distinct roots of $1$ in $R$, each corresponding
to a different coset in $R/\mathfrak{p}$. Together with $0$, these exhaust the cosets of $R/\mathfrak{p}$.
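The lifting step in (a) can be tried numerically. The following is only a sketch under simplifying assumptions: it takes the special case $R = \mathbb{Z}_p$ (so $q = p$), works modulo $p^k$ rather than in the completion, and uses Newton's iteration as a stand-in for the Hensel's Lemma argument of Corollary 6.29. The function name `hensel_roots_of_unity` is hypothetical.

```python
def hensel_roots_of_unity(p, k):
    """Lift each nonzero residue mod p to a (p-1)st root of 1 modulo p**k.

    Assumption: p is prime and we work in Z/p^k as a finite model of Z_p.
    """
    modulus = p ** k
    n = p - 1
    roots = []
    for residue in range(1, p):
        a = residue
        # Newton step a <- a - f(a)/f'(a) for f(X) = X^(p-1) - 1.
        # f'(a) = (p-1) a^(p-2) is a unit mod p, so it is invertible mod p**k.
        # Precision at least doubles per step, so k steps are more than enough.
        for _ in range(k):
            f_value = (pow(a, n, modulus) - 1) % modulus
            f_deriv = (n * pow(a, n - 1, modulus)) % modulus
            a = (a - f_value * pow(f_deriv, -1, modulus)) % modulus
        roots.append(a)
    return roots


roots = hensel_roots_of_unity(7, 5)
```

Each entry of `roots` reduces modulo 7 to the residue it was lifted from, which mirrors the statement that the roots of 1 correspond to distinct cosets.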

In (b), if $F$ has characteristic $p$, then raising to the $p$th power is a field mapping
of $F$ into itself. Since $q = p^m$, raising to the $q$th power is the $m$-fold iterate of
a field map and is a field map. If $a$ and $b$ are two $(q - 1)$st roots of $1$ in $R$, then
$(a \pm b)^q = a^q + (\pm b)^q = a + (\pm b)$, and so $a \pm b$ is either $0$ or a $(q - 1)$st root of $1$.
Since the nonzero elements of $E$ are closed under products and inverses, $E$ is a subfield.


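13. In (a) let $x$ be in $R$. Problem 12 produces a unique $a_0 \in E$ with $x - a_0$ in $\mathfrak{p}$, i.e.,
with $v(x - a_0) \geq 1$. Then $v(t^{-1}(x - a_0)) \geq 0$, and Problem 12 produces a unique
$a_1$ in $E$ with $t^{-1}(x - a_0) - a_1$ in $\mathfrak{p}$. Continuing in this way, we obtain $a_0, \dots, a_N$ in
$E$ with

$$t^{-1}\bigl(t^{-1}\bigl(\cdots\bigl(t^{-1}(x - a_0) - a_1\bigr) - \cdots\bigr) - a_{N-1}\bigr) - a_N$$

in $\mathfrak{p}$. Thus $v\bigl(x - \sum_{k=0}^{N} a_k t^k\bigr) \geq N + 1$. Since $F$ is complete, $\sum_{k=0}^{\infty} a_k t^k$ converges
with sum $x$. The statement about the value of $v$ is clear.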

In (b), the part about the series giving an element in $R$ is immediate from Problem 1,
since $t^k$ has limit $0$. The operations on $R$ now match those on $\mathbb{F}_q[[t]]$, and the isomorphism
follows. For (c), let $x$ be given with $x \notin R$. Set $v(x) = -N$. Then $v(t^N x) = 0$,
and we can apply (a) to write $t^N x = \sum_{k=0}^{\infty} a_k t^k$. Then $x = \sum_{k=0}^{\infty} a_k t^{k-N}$, as required.

14. In (a), the inclusion of the integers into $R$, followed by passage to the quotient
$R/\mathfrak{p}$, is an additive homomorphism. Since $R/\mathfrak{p}$ has order $q$, $q$ must map to the $0$
coset, namely $\mathfrak{p}$.

Part (a) shows that $v(q) \geq 1$. Since $v(q) = v(p^m) = m\,v(p)$, $v(p)$ is positive, and
(b) is proved. The same argument as in the proof of Ostrowski's Theorem shows that
$v(p') = 0$ for all prime numbers $p'$ other than $p$, and then (c) is immediate. For (d), it
is enough to check equality of the absolute values in question on the element $p$, and
for that we have eee = gpm) — g-lim — p-) 

For (e), the map of $\mathbb{Q}'$ to $\mathbb{Q}$, when composed with the completion $\mathbb{Q} \to \mathbb{Q}_p$, is a
homomorphism of valued fields into a complete field. It therefore extends uniquely
as a homomorphism of the closure $\overline{\mathbb{Q}'}$ into $\mathbb{Q}_p$. The dense set $\mathbb{Q}'$ maps to the dense
set $\mathbb{Q}$, and hence the extended map is an isomorphism.

Part (f) is just a repetition of the argument in Problems 13a and 13c. In (g), let 
i= a,t* be the expansion of f , and put oS yee axt,. Since v(t) = 1, we 
obtain v(x — cj) = v(t”) = vou(t) = vo. Therefore v(p lx — Cj))) = O. Iterating 
this procedure as in Problem 13a, we obtain a convergent expansion x = )7P-9 Cj pr 
For (h), we then have x = 7?) cj, pk = sy Cj ike) p* and we see that x lies 
in see Qc;. Therefore dim[ F : Q’] < J. 

15. Part (a) is immediate, and (b) follows from Theorem 6.33. For (c), $R/\mathfrak{p}$
corresponds to extracting the constant term from a power series in $t$, and thus $U/\wp =
\mathbb{F}_{q^f}$ is of dimension $f$ over $R/\mathfrak{p} = \mathbb{F}_q$. The computation $\wp T = tUT = tT = tRT =
\mathfrak{p}T = P^e$ shows that $K/L$ has ramification index $e$. For (d), each index (residue class
degree and ramification index) for $K/F$ is the product of that index for $K/L$ and that
index for $L/F$. So $e$ for $L/F$ is $1$, and $f$ for $K/L$ is $1$.

16. For (a), the irreducible polynomial $\bar{g}(X)$ has to be separable, and therefore all
of its roots in $k_K$ are simple. Application of Hensel's Lemma in the form of Corollary
6.29 produces $\alpha$. For (b), the polynomial $g(X)$ is monic with coefficients in $R$, and
its root $\alpha$ is therefore a member of $L$ integral over $R$. Thus $\alpha$ lies in $U$. The natural
field map $U/\wp \to T/P$ takes $u + \wp$ to $u + P$, hence takes $\alpha + \wp$ to $\alpha + P = \bar{\alpha}$. Thus
we can regard $\bar{\alpha}$ as a member of $k_L$. Since $k_F$ and $\bar{\alpha}$ generate $k_K$ by construction of
$\bar{\alpha}$, $k_L = k_K$.

For (d), let us use subscripts on the indices $e$ and $f$ to indicate the field extension
in question. Then we have $e_{L/F} f_{L/F} = [L : F] = \deg g(X) = \deg \bar{g}(X) =
[k_K : k_F] = f_{K/F}$ on the one hand and $f_{K/F} = [k_K : k_F] = [k_L : k_F] = f_{L/F}$ on
the other hand. The two chains of equalities together show that $e_{L/F} = 1$, and the
second one in combination with $f_{K/F} = f_{K/L} f_{L/F}$ shows that $f_{K/L} = 1$.

17. In (a), the element $y_j$ exists and is unique because of the nondegeneracy of the
trace form, which holds because $K/F$ is separable (Theorem 8.54 and Section IX.15
of Basic Algebra).

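In (b), the expression for the $z_k$'s in terms of the $y_j$'s shows that $\sum_{k=1}^{n} R z_k \subseteq
\sum_{j=1}^{n} R y_j$. The assumption $\det A = \pm 1$ implies that $B = A^{-1}$ lies in $M_n(R)$. Since
$y_j = \sum_k B_{kj} z_k$, we obtain $\sum_{j=1}^{n} R y_j \subseteq \sum_{k=1}^{n} R z_k$.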

For (c), it is evident that the degree is at most $n - 1$. Write $g(X) = \prod_{i=1}^{n} (X - \xi_i)$. The
opening computations of Section V.4 show that $g'(\xi_i) = \prod_{j \neq i} (\xi_i - \xi_j)$. Therefore
the value of the left side at $\xi_k$ for the identity in question is

$$\sum_{i=1}^{n} \frac{\prod_{j \neq i} (\xi_k - \xi_j)}{\prod_{j \neq i} (\xi_i - \xi_j)}.$$

The numerator is $0$ unless $i = k$. Thus only the $k$th term makes a contribution, and its
value, namely $1$, matches the value of the right side. Then (d) is a routine computation.
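The evaluation argument in (c) can be checked with exact rational arithmetic. The sketch below is illustrative only: it uses distinct rational numbers as stand-ins for the roots $\xi_i$ (in the problem they are conjugates in a field extension), and the helper names are hypothetical. It also checks the classical companion fact that $\sum_i \xi_i^k / g'(\xi_i)$ is $0$ for $k \le n-2$ and $1$ for $k = n-1$.

```python
from fractions import Fraction


def g_prime_at(roots, i):
    """g'(xi_i) = prod_{j != i} (xi_i - xi_j) for g(X) = prod_j (X - xi_j)."""
    value = Fraction(1)
    for j, r in enumerate(roots):
        if j != i:
            value *= roots[i] - r
    return value


def left_side(roots, x):
    """sum_i prod_{j != i} (x - xi_j) / g'(xi_i), a polynomial of degree <= n-1 in x."""
    total = Fraction(0)
    for i in range(len(roots)):
        numerator = Fraction(1)
        for j, r in enumerate(roots):
            if j != i:
                numerator *= x - r
        total += numerator / g_prime_at(roots, i)
    return total


def power_sum_over_derivative(roots, k):
    """sum_i xi_i^k / g'(xi_i)."""
    return sum(r ** k / g_prime_at(roots, i) for i, r in enumerate(roots))


# Stand-in roots (an assumption for illustration, not data from the problem).
sample_roots = [Fraction(v) for v in (2, -1, 5, 7)]
```

Because the left side has degree at most $n - 1$ and equals $1$ at the $n$ points $\xi_k$, it is identically $1$, which is what evaluating `left_side` at any other rational point confirms.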

For (e), the rational expression $(1 + c_1X + \cdots + c_nX^n)^{-1}$ on the left side is
expanded in series using $(1 + Z)^{-1} = 1 - Z + Z^2 - Z^3 + \cdots$. Thus the left side
is the sum of $X^n$ and a series beginning with a multiple of $X^{n+1}$. The right side is
peo TK /F (gg yr lee aca) , and the conclusion of the problem results by equating 
the indicated coefficients. 

For (f), the result of (e) handles the entries with $i + j \leq n + 1$. For those with
$n + 2 \leq i + j \leq 2n$, we write $\xi^{i+j-2} g'(\xi)^{-1}$ as $\xi^n \xi^{i+j-n-2} g'(\xi)^{-1}$, substitute for $\xi^n$
recursively from the field polynomial, and check that the traces are in $R$ by applying
(e). Thus all $A_{ij}$ are in $R$.

For (g), conclusion (f) shows that $A$ is triangular with 1's on the off diagonal, and
hence the determinant of $A$ is $\pm 1$. Put $z_k = \sum_j A_{jk} y_j$. Since $x_i = \xi^{i-1}$,

$$\mathrm{Tr}_{K/F}(z_k x_i) = \sum_j A_{jk} \mathrm{Tr}_{K/F}(y_j x_i) = A_{ik}
= \mathrm{Tr}_{K/F}\bigl(g'(\xi)^{-1} \xi^{i+k-2}\bigr) = \mathrm{Tr}_{K/F}\bigl((g'(\xi)^{-1} \xi^{k-1}) x_i\bigr).$$

Therefore $z_k = g'(\xi)^{-1} \xi^{k-1}$. Combining this equality with (b) shows that $\widehat{N} =
\sum_j R y_j = \sum_k R z_k = \sum_k R\, g'(\xi)^{-1} \xi^{k-1} = g'(\xi)^{-1} N$.

18. For (a), the assumption $f = n$ makes $\dim_{k_F}(k_K) = n$. Thus $\deg g(X) =
\deg \bar{g}(X) = n$. Since $\bar{g}(X)$ is irreducible, so is $g(X)$. The root $\alpha$ of $g(X)$ in $K$ is
such that $F(\alpha)$ is an $n$-dimensional subspace of $K$, hence equals $K$.

For (b), the conclusion $\widehat{N} \supseteq \widehat{T}$ follows from the definition. Since $\widehat{T} = D(K/F)^{-1}$,
we obtain $D(K/F)^{-1} \subseteq \widehat{N} = g'(\alpha)^{-1} N \subseteq g'(\alpha)^{-1} T$.

For (c), the polynomial $\bar{g}(X)$ was constructed as irreducible, and $g(X)$ was constructed
to reduce to $\bar{g}(X)$. Then $\bar{g}'(\bar{\alpha}) \neq 0$, and it follows that $g'(\alpha)$ is in $T$ but not
$P$. Thus $g'(\alpha)$ is a unit in $T$, and $g'(\alpha)^{-1} T = T$. Then $D(K/F)^{-1} \subseteq T$. Since
$D(K/F)^{-1} \supseteq T$ also, $D(K/F)^{-1} = T$, and $D(K/F) = T$.


19. For (a), we may assume that $v(x_1) \leq v(x_j)$ for $j > 1$. If $v(x_1) < v(x_j)$
for all $j > 1$, then induction on $m$ and use of property (vi) of discrete valuations show
that $v(0) = v(x_1 + \cdots + x_m) = v(x_1)$, contradiction.

For (b), the element $\pi$ is in $T$, and its minimal polynomial has coefficients in $R$
because $T$ is integral over $R$; in turn, the field polynomial is a power of the minimal
polynomial. Since $c_i$ is in $R$, we have $v_K(c_i) = n\,v_F(c_i)$, and therefore $v_K(c_i)$ is
divisible by $n$.

For (c), apply (a) to the equality $c_0 \pi^n + c_1 \pi^{n-1} + \cdots + c_n = 0$ to produce indices
$i < j$ with $v(c_i \pi^{n-i}) = v(c_j \pi^{n-j})$ and with $v(c_k \pi^{n-k}) \geq v(c_i \pi^{n-i})$ for all $k$. The
equality involving $i$ and $j$ implies that $j - i = v_K(c_j) - v_K(c_i)$. From $i < j \leq n$,
we have $n - i > 0$. Thus $v(c_i \pi^{n-i}) \geq v(c_i \pi) > 0$. By (b), $v(c_i \pi^{n-i}) \geq n$. So
$v(c_k \pi^{n-k}) \geq n$.

In (d), the right side of the equality $j - i = v_K(c_j) - v_K(c_i)$ is divisible by $n$,
by (b), and the left side is between $1$ and $n$. Hence the two sides equal $n$, and we
conclude that $i = 0$ and $j = n$. Thus the equality says that $n = v_K(c_n)$. Since $c_n$ is
in $F$ and since $v_K = n\,v_F$, $v_F(c_n) = 1$. Therefore $c_n$ is in $\mathfrak{p}$ but not $\mathfrak{p}^2$. The inequality
$v_K(c_k \pi^{n-k}) \geq n$ implies that $v_K(c_k) \geq k$. For $1 \leq k \leq n$, this conclusion implies
that $v_K(c_k) \geq 1$. Since $c_k$ is in $F$ and since $v_K = n\,v_F$, $v_F(c_k) > 0$ for $k \geq 1$. Thus
$c_k$ is in $\mathfrak{p}$ for $k \geq 1$.

In (e), the irreducibility is immediate from the Eisenstein irreducibility criterion, $R$
being a principal ideal domain. Since the field polynomial is a power of the minimal
polynomial, the field polynomial equals the minimal polynomial. Then the degree of
$F(\pi)$ is $n$. Since $F(\pi)$ is an $n$-dimensional subfield of the $n$-dimensional field $K$,
$K = F(\pi)$.

Part (f) is proved in the same way as Problem 14g. For (g), the expansion can be 
rewritten as ) 29 dk Yk = Dio Moc jac Mit Veit) = Voc jee (Lixo aei+j4'). 
The term in parentheses is the most general member of R, and the left side is the most 
general member of T. Thus (g) follows. 

In (h), conclusion (g) shows that $N = \sum_{i=0}^{n-1} R\pi^i$ equals $T$, and Problem 17 with
$\xi = \pi$ shows that $\widehat{N} = g'(\pi)^{-1} N$. Thus $D(K/F)^{-1} = \widehat{T} = g'(\pi)^{-1} T$. Multiplying
by $(g'(\pi)) D(K/F)$, we obtain $D(K/F) = (g'(\pi))$.

For (i), $g'(\pi) = e\pi^{e-1} + \sum_{k=1}^{n-1} k c_{n-k} \pi^{k-1} = e\pi^{e-1} + b$. In each term of $b$,
$v_K(k c_{n-k}) \geq e\,v_F(c_{n-k}) \geq e$, and $v_K(\pi^{k-1}) = k - 1$. Thus $v_K(b) \geq e$. Meanwhile,
$v_K(e\pi^{e-1}) = (e - 1) + v_K(e)$. Thus $v_K(g'(\pi)) \geq \min\bigl((e - 1) + v_K(e),\, v_K(b)\bigr)$,
and property (vi) of discrete valuations shows that equality holds if the two members
$(e - 1) + v_K(e)$ and $v_K(b)$ of the minimum are unequal. If $v_K(e) = 0$, then the
members are unequal, and we obtain $v_K(g'(\pi)) = e - 1$. Otherwise, we obtain
$v_K(g'(\pi)) \geq e$. We know that $D(K/F) = (g'(\pi)) = P^{v_K(g'(\pi))}$, and Lemma 6.47
follows.

1. If $x$ and $y$ are members of $L$ purely inseparable over $K$, then $x^{p^e}$ and $y^{p^{e'}}$ are
in $K$ for suitable $e$ and $e'$. Without loss of generality, let $e' \leq e$. Then $x^{p^e}$ and $y^{p^e}$
are in $K$, and hence $(x \pm y)^{p^e} = x^{p^e} \pm y^{p^e}$ are in $K$ and so are $(xy)^{p^e} = x^{p^e} y^{p^e}$ and
$(xy^{-1})^{p^e} = x^{p^e} y^{-p^e}$ if $y \neq 0$. So $x \pm y$, $xy$, and $xy^{-1}$ are purely inseparable over
$K$, the last of these if $y \neq 0$.

2. In view of Proposition 7.10, the given conditions imply that $[K(\alpha) : K] =
p[K(\alpha^p) : K]$ and that $X^p - \alpha^{p^{\mu+1}}$ is irreducible over $K(\alpha^{p^{\mu+1}})$ for every $\mu \geq 0$.
Since $\alpha^{p^\mu}$ is a root of this polynomial within $K(\alpha)$ for each $\mu < e$, $K(\alpha)$ has a
chain of subfields

$$K(\alpha^{p^e}) \subsetneq K(\alpha^{p^{e-1}}) \subsetneq \cdots \subsetneq K(\alpha^p) \subsetneq K(\alpha)$$

in which the consecutive degrees of the extensions are all $p$. Let $\beta$ be separable over
$K$ and let $K(\alpha^{p^r})$ be the first of these fields to contain $\beta$. Arguing by contradiction,
suppose that $r < e$. Then $\beta$ and $\alpha^{p^{r+1}}$ generate $K(\alpha^{p^r})$ because $[K(\alpha^{p^r}) : K(\alpha^{p^{r+1}})]$
is prime. The separability of $\beta$ over $K$ implies that $\beta$ is separable over $K(\alpha^{p^{r+1}})$, hence
that $K(\alpha^{p^r})$ is separable over $K(\alpha^{p^{r+1}})$, hence that $\alpha^{p^r}$ is separable over $K(\alpha^{p^{r+1}})$.
Since $(\alpha^{p^r})^p$ lies in $K(\alpha^{p^{r+1}})$, $\alpha^{p^r}$ is also purely inseparable over $K(\alpha^{p^{r+1}})$. By
Corollary 7.12, $\alpha^{p^r}$ lies in $K(\alpha^{p^{r+1}})$. This contradicts the fact that the above chain
of subfields is strictly increasing. We conclude that $r = e$. Hence all elements $\beta$
separable over $K$ lie in $K(\alpha^{p^e})$.


3. For suitable integers $R_a$, we form the tuple $z = (R_a + a\mathbb{Z})_{a \geq 1}$, using the
realization of the inverse limit in Proposition 7.27. We have to specify the integers
$R_a$. The condition for $z$ to lie in $\widehat{\mathbb{Z}}$, coming from the condition $f_{ab} \circ f_b = f_a$ when $a$
divides $b$, works out to be that $R_b - R_a$ is divisible by $a$ whenever $a$ divides $b$. After
the integers $R_a$ have been defined for all $a$, it is enough to check that $R_{pa} - R_a$ is
divisible by $a$ whenever $p$ is prime.

For $n$ odd, define $R_{2^e n} = nk + 1$, where $k$ is the unique integer from $0$ to $2^e - 1$
such that $nk + 1$ is divisible by $2^e$. This $k$ exists and is unique because $-n$ has an
inverse modulo $2^e$. One checks that $R_{2^{e+1}n} - R_{2^e n}$ is divisible by $2^e$ and by $n$, and that
$R_{2^e pn} - R_{2^e n}$ is divisible by $2^e$ and by $n$ if $p$ is an odd prime. The definition makes
$R_{2^e} \equiv 0 \bmod 2^e$ for every $e$ and $R_q = 1$ for every odd prime $q$, and therefore $z$ is not
of the form $z_c$ for any integer $c$.

4. The first part is immediate from Theorem 7.34. For the second part the group
$\mathrm{Gal}(\mathbb{R}/\mathbb{Q})$ is trivial. In fact, any member of $\mathrm{Gal}(\mathbb{R}/\mathbb{Q})$ must fix $\mathbb{Q}$ and map squares in
$\mathbb{R}$ to squares. It therefore respects the ordering. For any $r \in \mathbb{R}$, it fixes each rational
less than $r$, and hence it fixes $r$.
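Returning to the construction in Problem 3, the compatibility condition can be tested on a finite range. This is a minimal sketch, assuming the definition $R_{2^e n} = nk + 1$ exactly as stated in that solution; the function names and the search bound are illustrative choices.

```python
def split_two_part(a):
    """Write a = 2**e * n with n odd and return (e, n)."""
    e = 0
    while a % 2 == 0:
        a //= 2
        e += 1
    return e, a


def R(a):
    """R_a = n*k + 1, with k the unique integer in [0, 2**e) making it divisible by 2**e."""
    e, n = split_two_part(a)
    modulus = 2 ** e
    # Brute-force search mirrors the wording of the solution; it is fine for small a.
    k = next(k for k in range(modulus) if (n * k + 1) % modulus == 0)
    return n * k + 1


def compatible_up_to(bound):
    """Check that a | b implies a | (R_b - R_a) for all 1 <= a <= b <= bound."""
    return all(
        (R(b) - R(a)) % a == 0
        for a in range(1, bound + 1)
        for b in range(a, bound + 1, a)
    )
```

An integer $c$ with the same residues would have to be divisible by every power of 2, hence equal to 0, while also being congruent to 1 modulo 3, which is the reason the tuple is not of the form $z_c$.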


5. Use $K_n = \mathbb{Q}(\sqrt{p_1}, \dots, \sqrt{p_n})$, where $p_n$ is the $n$th prime, and Proposition 7.30
to see that $\mathrm{Gal}(K/\mathbb{Q})$ is an infinite product of groups of order 2. (A problem at the
end of Chapter IX of Basic Algebra can help with this step.) The open subgroups of
index 2 correspond to quadratic extensions of $\mathbb{Q}$, of which there are countably many.
Since $\mathrm{Gal}(K/\mathbb{Q})$ has uncountably many subgroups of index 2, such a subgroup $H$
exists that is not open. The field extension $K/\mathbb{Q}$ is normal, and thus $\mathrm{Gal}(K/\mathbb{Q})$ is a
homomorphic image of $\mathrm{Gal}(\mathbb{Q}_{\mathrm{alg}}/\mathbb{Q})$, say by a homomorphism $\varphi$. Then $\varphi^{-1}(H)$ is
the required subgroup of $\mathrm{Gal}(\mathbb{Q}_{\mathrm{alg}}/\mathbb{Q})$.


6. Suppose $I$ is primary. If $b + I$ is a zero divisor in $R/I$, then $ab$ is in $I$ for some
$a$ not in $I$. Since $I$ is primary, $b^m$ is in $I$ for some $m$. Thus $(b + I)^m = b^m + I = I$,
and $b + I$ is nilpotent in $R/I$.

If every zero divisor in $R/I$ is nilpotent, then the ideal $0$ in $R/I$ is primary because
whenever $(a + I)(b + I) = I$ and $a + I \neq I$, then the nilpotence of $b + I$ implies
that $b^m + I = I$ for some $m$. This says that the $0$ ideal $0 + I$ in $R/I$ is primary.

If the $0$ ideal in $R/I$ is primary and if $ab$ is in $I$ with $a$ not in $I$, then $(a + I)(b + I) =
I$ with $a + I \neq I$, and hence $(b + I)^m = I$ for some $m$, $0$ being primary in $R/I$. This
means that $b^m$ is in $I$, and $I$ is primary.


7. In (a), if $xy$ is in $\sqrt{I}$, then $(xy)^m$ is in $I$ for some $m$, and therefore either $x^m$ is
in $I$ or $y^{mn}$ is in $I$ for some $n$, i.e., either $x$ is in $\sqrt{I}$ or $y$ is in $\sqrt{I}$.

In (b), let $x$ be in $\sqrt{I}$, and choose $n$ such that $x^n$ is in $I$. Then $x^n$ is in $J$ because
$I \subseteq J$. Since $J$ is prime, some factor of $x^n$ is in $J$, i.e., $x$ is in $J$.


8. In (b), $R/I \cong \mathbb{C}[y]/(y^2)$. The zero divisors of $R/I$ are $cy$ with $c \in \mathbb{C}$, and
$(cy)^2 = 0$ in $R/I$ shows that $cy$ is nilpotent in $R/I$. By Problem 6, $I$ is primary. The
radical $P = \sqrt{I}$ is $(x, y)$ by inspection, and this is prime. Since $P^2 = (x^2, xy, y^2)$,
we have $P^2 \subsetneq I \subsetneq P$. If $I = Q^n$ for some prime ideal $Q$, then $I \subseteq Q$, and
Problem 7b shows that $\sqrt{I} \subseteq Q$. Since $\sqrt{I}$ is maximal in this case, $Q$ has to be $P$.
But $P^2 \subsetneq I \subsetneq P$ shows that $I$ is no power of $P$.

In (c), $R/P = K[X, Y, Z]/(XY - Z^2, X, Z) \cong K[Y]$, and this is an integral
domain. Hence $P$ is prime. Next, $P^2 = (x^2, xz, z^2)$. Thus $xy = z^2$ lies in $P^2$.
However, $x$ is not in $P^2$, and $y^m$ is not in $P^2$ for any $m > 0$. So $P^2$ is not primary.


9. Let $a$ and $b$ be in $R$ with $ab$ in $I$ and $a$ not in $I$. To show that $I$ is primary,
we are to show that $b$ is in $\sqrt{I}$. We do this by showing that $(b) + I \subseteq \sqrt{I}$. The
ideal $(b) + I$ is proper, since otherwise $1 = cb + x$ with $x \in I$, which implies that
$a = cba + xa$ is in $I$, contradiction. Let $J$ be a maximal ideal with $(b) + I \subseteq J$.
It is enough to show that $\sqrt{I} \subseteq J$; in fact, then $\sqrt{I} = J$ because $\sqrt{I}$ is assumed
maximal, and $(b) + I \subseteq \sqrt{I}$ as asserted. So let $u$ be in $\sqrt{I}$. Then $u^m$ is in $I \subseteq J$ for
some $m$, and $u$ is in $J$ because $J$ is prime.

This proves the first part. The second part follows from the observation that if $I$
is maximal, then $\sqrt{I^n} = I$. In fact, $I^n$ contains all elements $a^n$ for $a \in I$. So $\sqrt{I^n}$
has to contain all elements $a \in I$. Since $I$ is maximal and $\sqrt{I^n}$ has to be proper,
$\sqrt{I^n} = I$.

10. In (a), let $P$ be a prime ideal, and suppose that $P = I \cap J$ nontrivially. If $i$ is
in $I$ but not $J$ and if $j$ is in $J$ but not $I$, then $ij$ is in $P$, but $i$ is not in $P$ because $i$ is
not in $J$ and similarly $j$ is not in $P$ because $j$ is not in $I$.

In (b), $I^2 = (x^2, xy, y^2)$ is primary by Problem 9. The equality of $I^2$ with
$(Rx + I^2) \cap (Ry + I^2)$ holds by inspection.

11. Arguing by contradiction, we can use the Noetherian property to obtain an
ideal $I$ maximal with respect to the property of not being a finite intersection of proper
irreducible ideals. Since $I$ is not irreducible, $I = A \cap B$ nontrivially. By maximality,
$A$ and $B$ are finite intersections of this kind, and then so is $I$, contradiction.


12. Let $Q$ be a proper irreducible ideal in $R$. Then $0$ is a proper irreducible
ideal in $R/Q$. We show that $0$ is primary in $R/Q$, and then Problem 6 shows that
$Q$ is primary. Thus let $xy = 0$ in $R/Q$ with $y \neq 0$ in $R/Q$. We want to see
that some power of $x$ is $0$ in $R/Q$. In $R/Q$, we form the sequence of annihilators
$\mathrm{Ann}(x) \subseteq \mathrm{Ann}(x^2) \subseteq \cdots$ and use the Noetherian property of $R$ and its quotient $R/Q$
to obtain $\mathrm{Ann}(x^l) = \mathrm{Ann}(x^{l+1})$ for some $l$. Let us see that the intersection $(x^l) \cap (y)$
is $0$ in $R/Q$. In fact, if $a$ is in $(y)$, then $xy = 0$ implies $ax = 0$, and if $a$ is in $(x^l)$, then
$a = bx^l$ and $0 = ax = bx^{l+1}$, from which we see that $b$ is in $\mathrm{Ann}(x^{l+1}) = \mathrm{Ann}(x^l)$.
Therefore $a = bx^l = 0$ in $R/Q$. Thus indeed $(x^l) \cap (y) = 0$. Since $0$ is irreducible
in $R/Q$ and $(y) \neq 0$, we conclude that $(x^l) = 0$ and $x^l = 0$ in $R/Q$. This is what
we were to show.


13. If $ab$ is in $Q$ and $a$ is not in $Q$, then $ab$ is in $Q_i$ for all $i$ and $a$ is not in $Q_{i_0}$
for some $i_0$. Since $Q_{i_0}$ is primary, $b^m$ is in $Q_{i_0}$ for some $m$, i.e., $b$ is in $\sqrt{Q_{i_0}} = P$.
Since $\sqrt{Q_i} = P$ for all $i$, $b^{k_i}$ is in $Q_i$ for some $k_i$ depending on $i$. Taking $N$ to be
the maximum of the integers $k_i$, we see that $b^N$ is in each $Q_i$ and hence is in their
intersection $Q$. Thus $Q$ is primary.

Problem 7b shows that $\sqrt{Q} \subseteq P$. On the other hand, if $b$ is in $P$, we have just
seen that some power $b^N$ lies in $Q$. So $b$ lies in $\sqrt{Q}$. Therefore $\sqrt{Q} = P$.


14. Problem 11 shows that every ideal is the finite intersection of proper irreducible
ideals, and Problem 12 shows that these are primary. Thus if $I$ is given, we have
$I = \bigcap_i Q_i$ with each $Q_i$ primary. Group all $Q_i$'s whose associated prime ideal is
the same $P_j$, and denote the intersection of these by $Q_j'$. The ideal $Q_j'$ is primary
by Problem 13. Then $I = \bigcap_j Q_j'$, and the $Q_j'$ have distinct associated prime ideals.
So condition (ii) is satisfied. Finally among all expressions for $I$ as intersections
satisfying (ii), choose one that involves the smallest number of primary ideals. This
minimality forces (i) to hold.


Chapter VIII


1. $(q^{n+1} - 1)/(q - 1) = 1 + q + q^2 + \cdots + q^n$.

3. It is enough to consider a monomial $F(X_1, \dots, X_n) = X_1^{a_1} \cdots X_n^{a_n}$ with
$\sum_j a_j = d$. Then $X_j \frac{\partial}{\partial X_j}\bigl(X_1^{a_1} \cdots X_n^{a_n}\bigr) = a_j X_1^{a_1} \cdots X_n^{a_n}$, and the sum on $j$ equals
$d\,X_1^{a_1} \cdots X_n^{a_n}$.
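The monomial computation in Problem 3 extends by linearity to any homogeneous polynomial, and a small exact check makes this concrete. The following is a sketch with hypothetical helper names; polynomials are stored as dictionaries from exponent tuples to integer coefficients, and the sample polynomial is an arbitrary illustration.

```python
def partial(poly, j):
    """Formal partial derivative with respect to the j-th variable."""
    out = {}
    for exps, coeff in poly.items():
        if exps[j] > 0:
            lowered = exps[:j] + (exps[j] - 1,) + exps[j + 1:]
            out[lowered] = out.get(lowered, 0) + coeff * exps[j]
    return out


def times_variable(poly, j):
    """Multiply a polynomial by the j-th variable."""
    return {exps[:j] + (exps[j] + 1,) + exps[j + 1:]: c for exps, c in poly.items()}


def euler_sum(poly, n):
    """Compute sum_j X_j * dF/dX_j for a polynomial in n variables."""
    total = {}
    for j in range(n):
        for exps, c in times_variable(partial(poly, j), j).items():
            total[exps] = total.get(exps, 0) + c
    return {e: c for e, c in total.items() if c != 0}


# Sample homogeneous cubic X^3 + 2XYZ - 5YZ^2 (chosen only for illustration).
sample = {(3, 0, 0): 1, (1, 1, 1): 2, (0, 1, 2): -5}
```

For the sample cubic, `euler_sum(sample, 3)` returns each coefficient multiplied by the degree 3.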

4. If $f^{\iota}$ and $g^{\iota}$ have a nontrivial common factor in $B[X]$, then $0 = R(f^{\iota}, g^{\iota}) =
\iota(R(f, g))$. Since $\iota$ is one-one, $R(f, g) = 0$. Therefore $f$ and $g$ have a nontrivial
common factor in $A[X]$.

5. Let us show that if $g_n \neq 0$ and $f_m = 0$, then Theorem 8.1 for indices $(m - 1, n)$
implies the theorem for indices $(m, n)$, and vice versa. Assume for the moment that
$m \geq 2$. Let $\mathcal{R}(f, g)$ be the resultant matrix of size $m + n$ that takes into account all
coefficients $f_0, \dots, f_m$ of $f$, and let $R(f, g)$ be its determinant. With $f_m = 0$, let
$\mathcal{R}'(f, g)$ be the resultant matrix of size $m + n - 1$, and let $R'(f, g)$ be its determinant.
The matrix $\mathcal{R}'(f, g)$ is obtained by erasing the $m$th row and last column of $\mathcal{R}(f, g)$. On
the other hand, the only nonzero entry in the last column of $\mathcal{R}(f, g)$ is $g_n$. Expansion
in cofactors therefore gives $R(f, g) = g_n R'(f, g)$. The hypotheses of Theorem 8.1
apply to $f$ and $g$ for either of these resultants, and we have just seen that the two
conditions (c) are equivalent. Certainly the two conditions (a) are equivalent. For the
two conditions (b), the resultant of size $m + n - 1$ tells us that $a'f + b'g = R'(f, g)$
with $\deg a' < n$ and $\deg b' < m - 1$. Certainly this implies that $af + bg = R(f, g)$
with $a = a'g_n$ and $b = b'g_n$. Conversely if $af + bg = R(f, g)$ with $\deg a < n$ and
$\deg b < m$, we define $a' = a g_n^{-1}$ and $b' = b g_n^{-1}$. Then $a'f + b'g = R'(f, g)$ with
$\deg a' < n$, and we need to see that $\deg b' = \deg b < m - 1$. Since $f_m = 0$, all the
powers of $X$ in $af$ are $\leq (n - 1) + (m - 1)$, and the same must be true in $bg$. Since
$g$ has degree $n$, we must have $\deg b \leq m - 2 < m - 1$, as required.

Next we check what happens when $m = 1$ and we are comparing the resultant of
size $n + 1$ and a degenerate resultant whose matrix is of size $n$ and contains only the
entries of g. The determinant formula is still valid, and we see that R’(f, g) = 35, 
which is nonzero. Thus (a) and (c) are false for both sizes. For (b), we cannot have
$af + bg = 0$ with $\deg b < 0$ and $b \neq 0$. We need to check that $af + bg = 0$ cannot
happen with $\deg a < n$ and $\deg b < 1$; in fact, then $\deg bg = \deg g = n$, while
$f_1 = 0$ implies that $\deg af < n + \deg f = n$. So we cannot have $af + bg = 0$ in
this case either.

The result of these calculations is that Theorem 8.1 for $(m, n)$ is equivalent to the
theorem for $(m - 1, n)$ if $g_n \neq 0$ and $f_m = 0$. Using induction, we see that the theorem
for $(m, n)$ is equivalent to the theorem for $(k, n)$ if $g_n \neq 0$ and $f_{k+1} = \cdots = f_m = 0$.
Taking $k = \deg f$ gives the desired result.

6. Proof via Nullstellensatz: Since $f$ is irreducible and $K[X_1, \dots, X_n]$ is a unique
factorization domain, the principal ideal $(f)$ is prime. Corollary 7.2 shows that $g$ lies
in $(f)$; hence $g = hf$ for some $h$.

Proof via resultants: The idea is to arrange to have

$$af + bg = R(f, g), \qquad (*)$$

with the resultant taken with respect to $X_n$. Proposition 8.1 shows that this happens
if $f$ and $g$ are of positive degree in $X_n$, and we shall show that either this is the case
or else $f$ divides $g$ for easy reasons. Since $f$ is nonconstant, it depends nontrivially
on some $X_j$, and renumbering the variables allows us to assume that $f$ depends
nontrivially on $X_n$. Then $f$ is of the form

$$f(X_1, \dots, X_n) = c_0(X_1, \dots, X_{n-1}) + c_1(X_1, \dots, X_{n-1}) X_n + \cdots + c_r(X_1, \dots, X_{n-1}) X_n^r,$$

with $r > 0$ and with $c_r$ nonzero in $K[X_1, \dots, X_{n-1}]$. If $g = 0$, then certainly $f$
divides $g$. So we may assume that $g \neq 0$. Choose $a_1, \dots, a_{n-1}$ in $K$ such that

$$g(a_1, \dots, a_{n-1}, X_n)\, c_r(a_1, \dots, a_{n-1}) \neq 0. \qquad (**)$$

Then $f(a_1, \dots, a_{n-1}, X_n)$ is a polynomial in $X_n$ whose coefficient of $X_n^r$ is nonzero.
Since $K$ is algebraically closed, this polynomial in $X_n$ has a root, say $a_n$. Since
$f(a_1, \dots, a_n) = 0$, the hypothesis shows that $g(a_1, \dots, a_{n-1}, a_n) = 0$, and $(**)$
allows us to conclude that $g = g(X_1, \dots, X_n)$ depends nontrivially on $X_n$. This
proves $(*)$.
To complete the proof, we show that $c_r R$ is $0$ at every point $(b_1, \dots, b_{n-1})$. Since
$K$ is infinite, it will follow that the polynomial $c_r R$ is $0$; thus $R = 0$ because $c_r$
is not the $0$ polynomial. Then $f$ and $g$ will have a nontrivial common factor by
Proposition 8.1, and $f$ will have to divide $g$ because $f$ is prime. Thus suppose that
$c_r(b_1, \dots, b_{n-1}) \neq 0$. Then $f(b_1, \dots, b_{n-1}, X_n)$ is a nonconstant polynomial in $X_n$
and must have a root $b_n$, since $K$ is algebraically closed. Hence $f(b_1, \dots, b_n) = 0$,
and the hypothesis on $g$ shows that $g(b_1, \dots, b_n) = 0$. By $(*)$, $R(b_1, \dots, b_{n-1}) = 0$.
This completes the proof.

8. The resultant matrix in the $W$ variable is

$$\begin{pmatrix}
XY^4 - Y^5 & -2X^2Y^2 & X^3 & 0 \\
0 & XY^4 - Y^5 & -2X^2Y^2 & X^3 \\
Y^4 & Y^3 & -X^2 & 0 \\
0 & Y^4 & Y^3 & -X^2
\end{pmatrix},$$

and its determinant is $-X^3Y^9(Y - 2X)^2$. Substituting into either of the equations
$F = 0$ and $G = 0$ gives the projective solutions $(x, y, w)$ equal to $(1, 0, 0)$, $(0, 0, 1)$,
and $(1, 2, 4 \pm 4\sqrt{2})$, up to nonzero scalar factors. (One has to check that both the
equations $F = 0$ and $G = 0$ are satisfied.)
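The determinant in Problem 8 can be spot-checked with integer arithmetic. This is a sketch under an explicit assumption: the coefficients of $W^0, W^1, W^2$ are taken to be $XY^4 - Y^5,\ -2X^2Y^2,\ X^3$ for $F$ and $Y^4,\ Y^3,\ -X^2$ for $G$, as read off from the rows of the matrix; the function names are hypothetical.

```python
def det(matrix):
    """Determinant by cofactor expansion along the first row (adequate for 4x4)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for col, entry in enumerate(matrix[0]):
        if entry == 0:
            continue
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * entry * det(minor)
    return total


def coeffs_F(x, y):
    """Coefficients of W^0, W^1, W^2 in F (assumed from the matrix rows)."""
    return [x * y**4 - y**5, -2 * x**2 * y**2, x**3]


def coeffs_G(x, y):
    """Coefficients of W^0, W^1, W^2 in G (assumed from the matrix rows)."""
    return [y**4, y**3, -x**2]


def sylvester_in_w(x, y):
    """The 4-by-4 resultant matrix in W, evaluated at integers x and y."""
    f, g = coeffs_F(x, y), coeffs_G(x, y)
    return [f + [0], [0] + f, g + [0], [0] + g]


def closed_form(x, y):
    """The stated determinant -X^3 Y^9 (Y - 2X)^2."""
    return -x**3 * y**9 * (y - 2 * x) ** 2
```

At $(x, y) = (1, 2)$ both polynomials reduce, up to sign, to $W^2 - 8W - 16$, whose roots are $4 \pm 4\sqrt{2}$.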


9. Introduce a new indeterminate $T = Y_i - Z_j$, and remove $Y_i$. Then $R(F, G) =
R(Y_1, \dots, T + Z_j, \dots, Y_m, Z_1, \dots, Z_n)$ is a polynomial in $T$, the $Z_j$'s, and all the
$Y$'s except for $Y_i$. Also, $R(F, G) = 0$ when $T$ is set equal to $0$. Hence $R(F, G)$ is
divisible by $T$. Then (a) and (b) follow. For (c), the polynomials $Y_i - Z_j$ are distinct
primes. Since each divides $R(F, G)$, their product must divide. Their product has
the same degree as $R(F, G)$, and the result follows.

10. We may assume that $K$ is algebraically closed and that $f$ is monic, say with
$f(X) = \prod_{i=1}^{m} (X - \xi_i)$ and $f'(X) = m \prod_{j=1}^{m-1} (X - \eta_j)$. Then the previous problem
gives $f'(\xi_i) = m \prod_{j=1}^{m-1} (\xi_i - \eta_j)$, and

$$R(f, f') = c_{m,m-1}\, m^m \prod_{i,j} (\xi_i - \eta_j) = c_{m,m-1} \prod_{i=1}^{m} f'(\xi_i)$$

with $c_{m,m-1}$ equal to the constant $c$ from Problem 9c when $n = m - 1$. According
to Section V.4, the product is $(-1)^{m(m-1)/2}$ times the discriminant $D(f)$ of $f$. So the
result follows.

11. Replace $G$ by $G(X, Y, W) - (X^2 + Y^2) F(X, Y, W)$ to get $YW H(X, Y, W)$,
where $H(X, Y, W) = (X^2 + Y^2)(X^2 - 3Y^2) - 4X^2YW$. Then

$$I(P, F \cap G) = I(P, F \cap YWH) = I(P, F \cap Y) + I(P, F \cap W) + I(P, F \cap H).$$

For $I(P, F \cap Y)$, we use the method of Section 4, looking at $F(t, 0, 1)$, which is $t^4$;
thus $I(P, F \cap Y) = 4$. Since $P$ is not on $W$, $I(P, F \cap W) = 0$.

For $I(P, F \cap H)$, replace $H$ by $H(X, Y, W) - F(X, Y, W)$ to get $Y J(X, Y, W)$,
where $J(X, Y, W) = -4X^2Y - 4Y^3 - 7X^2W + Y^2W$. Then

$$I(P, F \cap H) = I(P, F \cap YJ) = I(P, F \cap Y) + I(P, F \cap J),$$

and again $I(P, F \cap Y) = 4$. If the local expressions of $F$ and $J$ are denoted by $f$
and $j$, then their lowest-order terms $f_3(x, y)$ and $j_2(x, y)$ are given by

$$f_3(x, y) = 3x^2y - y^3 = y(\sqrt{3}x + y)(\sqrt{3}x - y),$$
$$j_2(x, y) = -7x^2 + y^2 = -(\sqrt{7}x + y)(\sqrt{7}x - y).$$

Thus $F$ and $J$ have no tangent lines in common at $P$, and $I(P, F \cap J) = 3 \cdot 2 = 6$.
Collecting the results, we find that $I(P, F \cap G) = 4 + 0 + 4 + 6 = 14$.
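The replacement of $H$ by $H - F$ can be verified by direct expansion. The derivation below is a sketch that assumes $F(X, Y, W) = (X^2 + Y^2)^2 + 3X^2YW - Y^3W$; this formula is not restated in the solution and is inferred here from the lowest-order term $f_3$ and from the stated form of $J$, so it should be treated as an assumption.

```latex
\begin{align*}
H - F &= (X^2+Y^2)(X^2-3Y^2) - 4X^2YW - (X^2+Y^2)^2 - 3X^2YW + Y^3W \\
      &= (X^2+Y^2)\bigl((X^2-3Y^2) - (X^2+Y^2)\bigr) - 7X^2YW + Y^3W \\
      &= -4Y^2(X^2+Y^2) - 7X^2YW + Y^3W \\
      &= Y\,\bigl(-4X^2Y - 4Y^3 - 7X^2W + Y^2W\bigr) = Y\,J(X, Y, W).
\end{align*}
```

Setting $W = 1$, the lowest-order part of $J$ is $-7x^2 + y^2$ and that of the assumed $F$ is $3x^2y - y^3$, in agreement with the terms used for the tangent lines.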


12. Let $P = [x_0, y_0, w_0]$, and choose $\Phi \in GL(3, K)$ with $\Phi(x_0, y_0, w_0) =
(0, 0, 1)$. The local versions of $G$ and $L$ are $g(X, Y) = G(\Phi^{-1}(X, Y, 1))$ and
$l(X, Y) = L(\Phi^{-1}(X, Y, 1))$. The expansion of $g$ as a sum of homogeneous polynomials
is $g = g_m + \cdots + g_d$ because $m = m_P(G) > 0$, and $l$ is of the form
$l(X, Y) = aX + bY$ because $P$ lies on $L$. We can parametrize $l$ by $\varphi(t) = (bt, -at)$,
and then the definition of intersection multiplicity is that $I(P, L \cap G)$ is the least
integer $k$ such that the expression $g_k(\varphi(t)) = t^k g_k(b, -a)$ is nonzero. The definition
of tangent line is any projective line $L_i$ whose local version $l_i$ is one of the
factors of $g_m(X, Y) = c \prod_i (\alpha_i X + \beta_i Y)^{m_i}$. Then $g_m(\varphi(t)) = t^m g_m(b, -a) =
c\,t^m \prod_i (\alpha_i b - \beta_i a)^{m_i}$. If $(a, b)$ is a multiple of some $(\alpha_i, \beta_i)$, then $g_m(\varphi(t)) = 0$; hence
$I(P, L \cap G) \geq m + 1$. Otherwise $g_m(\varphi(t)) \neq 0$, and $I(P, L \cap G) = m$.

13. The linear span $LT(I)$ of the members $LT(f)$ for $f$ in $I$ is a monomial ideal
and is of the form $(M_1, \dots, M_k)$ for suitable monomials $M_j$ each of the form $LM(f_j)$
for some $f_j$ in $I$. Then $\{f_1, \dots, f_k\}$ is a subset of $I$ such that $(LT(f_1), \dots, LT(f_k)) =
LT(I)$, and $\{f_1, \dots, f_k\}$ is a Gröbner basis of $I$ by definition.


14. If $\alpha, \beta, \gamma$ are vectors of exponents in monomials such that the first $i$ with
$w^{(i)} \cdot \alpha \neq w^{(i)} \cdot \beta$ has $w^{(i)} \cdot \alpha > w^{(i)} \cdot \beta$, then it is equally true that the first $i$ with
$w^{(i)} \cdot (\alpha + \gamma) \neq w^{(i)} \cdot (\beta + \gamma)$ has $w^{(i)} \cdot (\alpha + \gamma) > w^{(i)} \cdot (\beta + \gamma)$. This proves that
property (i) of monomial orderings holds with no further conditions on the weights.
Property (ii) says for each vector $\alpha$ of nonnegative exponents not all $0$ that the first $i$
with $w^{(i)} \cdot \alpha \neq 0$ has $w^{(i)} \cdot \alpha > 0$. Applying this condition as a necessary condition
to the $j$th standard basis vector $\alpha = e_j$, we see that the first $i$ such that $w_j^{(i)} \neq 0$ must
have $w_j^{(i)} > 0$ for (ii) to hold. On the other hand, if this condition holds for all $j$, then
a suitable positive linear combination of these conditions gives (ii) for any $\alpha$.
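The two properties discussed in Problem 14 are easy to experiment with. The sketch below assumes a monomial is represented by its exponent tuple and an ordering by a list of weight vectors $w^{(1)}, w^{(2)}, \dots$ compared in sequence; both function names are hypothetical.

```python
def compare_by_weights(weights, alpha, beta):
    """Return 1, -1, or 0 using the first weight vector that separates alpha and beta."""
    for w in weights:
        a = sum(wi * ai for wi, ai in zip(w, alpha))
        b = sum(wi * bi for wi, bi in zip(w, beta))
        if a != b:
            return 1 if a > b else -1
    return 0


def first_nonzero_weights_positive(weights, n):
    """Check the criterion for property (ii): for each j, the first nonzero w_j^(i) is > 0.

    If every weight vector has j-th entry 0, the criterion is vacuous for that j
    and this function does not flag it.
    """
    for j in range(n):
        for w in weights:
            if w[j] != 0:
                if w[j] < 0:
                    return False
                break
    return True


lex = [(1, 0), (0, 1)]          # lexicographic order with X > Y
graded_lex = [(1, 1), (1, 0)]   # total degree first, then the X exponent
```

Adding the same exponent vector to both arguments leaves every dot-product difference unchanged, which is the content of property (i).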

15. In (a), $a > a'$ implies that $X^{a-a'} \geq X > Y^{b'}$ for all $b' \geq 0$. Multiplying
by $X^{a'}$ gives $X^a > X^{a'}Y^{b'}$. Since $Y^b \geq 1$ implies $X^aY^b \geq X^a$, we conclude that
$X^aY^b > X^{a'}Y^{b'}$ for all $b$ and $b'$. For $a = a'$, we observe that $b > b'$ implies that
$Y^{b-b'} > 1$ and hence that $Y^b > Y^{b'}$. Multiplying by $X^a$ gives $X^aY^b > X^aY^{b'}$. Hence
the ordering is lexicographic.

In (b), we observe that an inequality between $X^a$ and $Y^b$ implies the same inequality
between $X^{na}$ and $Y^{nb}$. Consequently the particular inequality for $X^a$ and $Y^b$ depends
only on the rational number $a/b$. The assumption for (b) is that $X < Y^q$, hence that
$X^a < Y^{qa} \leq Y^b$ if $qa \leq b$, thus if $a/b \leq q^{-1}$. Thus the set $S$ of rationals $a/b$ such
that $X^a > Y^b$ is bounded below by $q^{-1}$. Let $r^{-1}$ be the greatest lower bound of $S$. We
know then that $q^{-1} \leq r^{-1}$, hence that $r \leq q$. So $0 < r < \infty$, and $r$ is a well-defined
real number.

Suppose that $u/v < r^{-1}$. Then $u/v$ is not in $S$, and so $X^u < Y^v$. In the reverse
direction, suppose that $u/v > r^{-1}$. Then there is some rational $c/d$ in $S$ with
$u/v > c/d \geq r^{-1}$; this has $X^c > Y^d$. Then $X^{ud} > X^{vc} > Y^{vd}$. Since $d > 0$,
$X^u < Y^v$ would imply $X^{ud} < Y^{vd}$, which is false. Thus we must have $X^u > Y^v$.
This proves (b).

For (c), the only rational $u/v$ for which the inequality between $X^u$ and $Y^v$ is not
decided is $u/v = r^{-1}$, and that only if $r$ is rational. In this case a single weight vector
will decide the correct inequality. All other inequalities between monomials follow
from these. In fact, what needs deciding is the inequality between $X^aY^b$ and $X^{a'}Y^{b'}$
when $a > a'$ and $b < b'$, and this is the same as the inequality between $X^{a-a'}$ and
$Y^{b'-b}$.

16. The formulas for $f$ are a matter of computation. Both satisfy the conditions
of Proposition 8.20 because $LM(f) = X^2Y$ is $\geq$ each of $LM((X + Y)f_1) = X^2Y$,
$LM(1 \cdot f_2) = Y^2$, $LM(Xf_1) = X^2Y$, and $LM((X + 1)f_2) = XY^2$ and because no term
of $r_1$ or $r_2$ is divisible by $LM(f_1) = XY$ or $LM(f_2) = Y^2$.
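The "matter of computation" in Problem 16 can be replayed mechanically. The sketch below assumes specific data that this solution does not restate: $f = X^2Y + XY^2 + Y^2$, $f_1 = XY - 1$, $f_2 = Y^2 - 1$, with remainders $r_1 = X + Y + 1$ and $r_2 = 2X + 1$ and the lexicographic order with $X > Y$. These values are consistent with the leading monomials quoted above but are an assumption, as are the helper names.

```python
def add(p, q):
    """Sum of two polynomials stored as {(deg_X, deg_Y): coefficient}."""
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + c
    return {m: c for m, c in out.items() if c != 0}


def mul(p, q):
    """Product of two polynomials in X and Y."""
    out = {}
    for (a1, b1), c1 in p.items():
        for (a2, b2), c2 in q.items():
            key = (a1 + a2, b1 + b2)
            out[key] = out.get(key, 0) + c1 * c2
    return {m: c for m, c in out.items() if c != 0}


def leading_monomial(p):
    """Leading monomial for lex order with X > Y (tuple comparison does exactly this)."""
    return max(p)


X, Y, ONE = {(1, 0): 1}, {(0, 1): 1}, {(0, 0): 1}
f = {(2, 1): 1, (1, 2): 1, (0, 2): 1}   # assumed f = X^2 Y + X Y^2 + Y^2
f1 = {(1, 1): 1, (0, 0): -1}            # assumed f1 = XY - 1
f2 = {(0, 2): 1, (0, 0): -1}            # assumed f2 = Y^2 - 1
r1 = {(1, 0): 1, (0, 1): 1, (0, 0): 1}  # assumed r1 = X + Y + 1
r2 = {(1, 0): 2, (0, 0): 1}             # assumed r2 = 2X + 1

first = add(add(mul(add(X, Y), f1), mul(ONE, f2)), r1)
second = add(add(mul(X, f1), mul(add(X, ONE), f2)), r2)
```

Both `first` and `second` reproduce `f`, and the leading monomials of the four products are bounded by `leading_monomial(f)`, matching the two conditions checked in the solution.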

17. In (a), we check that $\{X^2 + cXY, XY\}$ is a Gröbner basis using Theorem 8.23.
The leading monomials of the two generators are $X^2$ and $XY$, and neither divides the
other. Since the leading coefficients are 1, this Gröbner basis is minimal.

In (b) when $c \neq 0$, $X^2 + cXY$ has a nonzero term whose monomial is divisible by
the leading monomial of another generator; specifically the term $cXY$ in $X^2 + cXY$
is divisible by the $XY$ from the other generator. Following the procedure in Theorem
8.28, we find that $\{X^2, XY\}$ is the reduced Gröbner basis.


18. If $(c_1, \dots, c_n)$ lies in $V_K(I)$, then $c_j$ is one of finitely many roots of $P_j(X)$,
for each $j$. Hence $|V_K(I)| \leq \prod_{j=1}^{n} \deg P_j$.


19. Fix $j$, and choose a polynomial $Q_j$ in $X$ that vanishes at the $j$th coordinate of
every member of $V_K(I)$. Then $P_j(X_1, \dots, X_n) = Q_j(X_j)$ is a polynomial vanishing
on $V_K(I)$, and the Nullstellensatz shows that some power of it is in $I$. The result is a
polynomial in $X_j$ alone, as required.


20. If $V_K(I)$ is a finite set, then Problem 19 shows that $I$ contains a nonconstant
polynomial in $X_j$ for each $j$. The leading monomial for the $j$th such polynomial has to
be a power of $X_j$, and it lies in $LT(I)$. Conversely suppose that a power $X_j^{m_j}$ lies in $LT(I)$
for each $j$. Form a reduced Gröbner basis of $I$. Since the only monomials dividing
$X_j^{m_j}$ are powers of $X_j$, there exist members $g_j$ of the Gröbner basis for $1 \leq j \leq n$
such that

$$g_j(X_1, \dots, X_n) = X_j^{m_j} + X_j^{m_j - 1} a_{j,m_j-1} + \cdots + X_j a_{j,1} + a_{j,0}$$

for suitable polynomials $a_{j,m_j-1}, \dots, a_{j,0}$ in $X_{j+1}, \dots, X_n$. Then $V_K(I)$ is contained
in $V_K((g_1, \dots, g_n))$, and any member $(c_1, \dots, c_n)$ of the latter has the property for
each $j$ that $c_j$ is a root of a polynomial of degree $m_j$ in one variable, once $(c_{j+1}, \dots, c_n)$
is fixed. Thus $V_K(I)$ is contained in a finite set and has to be finite.


21. For (a), the coefficients $a_{i_1 \cdots i_n}$ are given as in $K(X)$, and we look for solutions
of $F(T_1, \dots, T_n) = 0$. Clearing fractions in the coefficients, we see that it is enough
to find a solution when each $a_{i_1 \cdots i_n}$ has denominator 1.

For (b), substitution of $T_i = \sum_{j=0}^{N} b_{ij} X^j$, where each $b_{ij}$ is an unknown in $K$, into
the equation $F(T_1, \dots, T_n) = 0$ gives

$$\sum_{i_1, \dots, i_n} a_{i_1 \cdots i_n}(X) \Bigl(\sum_{j=0}^{N} b_{1j} X^j\Bigr)^{i_1} \cdots \Bigl(\sum_{j=0}^{N} b_{nj} X^j\Bigr)^{i_n} = 0.$$

We expand this out and set the coefficient of each power of $X$ equal to $0$. The largest
possible power of $X$ that can appear is the sum of the largest power of $X$ in any $a_{i_1 \cdots i_n}$,
namely $\delta$, and $\sum_{k=1}^{n} N i_k$. Since $F$ is homogeneous of degree $d$, $\sum_{k=1}^{n} i_k = d$. Thus
the largest possible power of $X$ is $Nd + \delta$. We get one equation for each power of $X$
that appears, and the unknowns are the various $b_{ij}$'s.
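The bookkeeping in (b) lends itself to a quick tabulation. The sketch below counts at most $Nd + \delta + 1$ equations (one per power of $X$ from $0$ to $Nd + \delta$) and $(N + 1)n$ unknowns $b_{ij}$ (indices $1 \le i \le n$, $0 \le j \le N$), and searches for the least $N$ with more unknowns than equations. It assumes $d < n$; the function names and sample values are illustrative only.

```python
def counts(N, n, d, delta):
    """Return (upper bound on number of equations, number of unknowns)."""
    equations = N * d + delta + 1   # powers X^0, ..., X^(N*d + delta)
    unknowns = (N + 1) * n          # b_ij for 1 <= i <= n, 0 <= j <= N
    return equations, unknowns


def smallest_N(n, d, delta):
    """Least N >= 0 with strictly more unknowns than equations (requires d < n)."""
    if not d < n:
        raise ValueError("the count only works when d < n")
    N = 0
    while True:
        equations, unknowns = counts(N, n, d, delta)
        if equations < unknowns:
            return N
        N += 1
```

With $n = 3$, $d = 2$, $\delta = 10$ the search stops at $N = 9$, where there are at most 29 equations in 30 unknowns.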


22. The number of equations is $\leq Nd + \delta + 1$, since the powers of $X$ go from $0$ to
at most $Nd + \delta$. The number of unknowns is one for each index $i$ with $1 \leq i \leq n$ and
each possible power of $X$ from $0$ to $N$, hence exactly $(N + 1)n$. For $N$ sufficiently
large we want to see that $Nd + \delta + 1 < (N + 1)n$. Since $d < n$, the inequality in
question is $\delta + 1 - n < N(n - d)$, and this is satisfied by taking $N$ large enough.

23. In the context of Problem 22, we have a homogeneous system with more
unknowns than equations (for large $N$). If the number of unknowns is $n + 1$ and
the number of equations is $m$, then we are looking for solutions in $\mathbb{P}^n_K$. Since the
inequality $m \leq n$ is satisfied, the quoted theorem applies and produces a nonzero
solution for the $b_{ij}$'s.

1. For (a), we argue by contradiction. Suppose that $c_1(x), \dots, c_n(x)$ are members
of $k(x)$, not all $0$, such that $\sum_j c_j(x) t_j = 0$. Clearing fractions, we may assume that
each $c_j(x)$ lies in $k[x]$. If necessary, we can divide through by a power of $x$ and
arrange that some $c_j(x)$, say $c_{j_0}(x)$, has a nonzero constant term. The element $x$ is
by assumption transcendental over $k$. Applying the substitution homomorphism of
$k[x]$ into $k$ given by evaluation at $0$ yields $\sum_j c_j(0) t_j = 0$. By the assumed linear
independence of $t_1, \dots, t_n$ over $k$, $c_j(0) = 0$ for all $j$. This contradicts the fact that
$c_{j_0}(0) \neq 0$. Then (b) is immediate. For (c), we know that $[F : k(x)] < \infty$, and
therefore $[k'(x) : k(x)] < \infty$. By (b), $[k' : k] < \infty$.


2. This is immediate from Proposition 7.15. Alternatively, here is a direct proof.
We may assume that the characteristic is $p$. It is enough to prove that if $K$ is perfect
and $L$ is a finite extension, then $L$ is perfect. Arguing by contradiction, we may
assume that $[L : K]$ is as small as possible among all counterexamples. The image
$M$ of $L$ under $x \mapsto x^p$ is a subfield of $L$, and $M$ contains $K$ because $K$ is perfect.
We cannot have $M = L$, since $L$ is assumed not to be perfect. By construction of
$L$, $M$ is perfect. Composing $x \mapsto x^p$ from $L$ into $M$ with $x \mapsto x^{1/p}$ from $M$ into
itself, we obtain a field map of $L$ onto $M$ that fixes $M$. The result is a one-one $M$
linear transformation of the finite-dimensional $M$ vector space $L$ onto a proper vector
subspace, contradiction.


3. Let $F$ be a function field in one variable over $k$. Since $k$ is perfect, Theorem
7.20 shows that $F$ is separably generated. Let us write $F = k(x_1,\dots,x_n)$. Theorem
7.18 shows that there is some $x_j$ such that $F$ is a separable extension of $k(x_j)$. If we
write $x$ for $x_j$, then the Theorem of the Primitive Element shows that $F = k(x)[y]$
for some $y$ algebraic over $k(x)$. Put $R = k[x][y] = k[x, y]$; the field of fractions of
$R$ is $F$. Let $g(x, Y)$ be the minimal polynomial of $y$ over $k(x)$. If $d(x)$ is a common
denominator for the coefficients of $g(x, Y)$, then $d(x) \ne 0$ because $x$ is transcendental
over $k$. If we set $f(X, Y) = d(X)g(X, Y)$, then $f(x, y) = 0$. Hence the substitution
homomorphism $k[X, Y] \to R$ given by replacing $X$ by $x$ and $Y$ by $y$ factors through
to a homomorphism $\varphi$ carrying $k[X, Y]/(f(X, Y))$ onto $R$. The ring $R$ is an integral
domain; hence the ideal $(f(X, Y))$ is prime, and $f(X, Y)$ is irreducible. We can find
an ideal $I$ in $k[X, Y]$ containing $(f(X, Y))$ such that $\varphi$ descends to an isomorphism
of $k[X, Y]/I$ onto $R$. This ideal $I$ has to be prime, and we let $J$ be a maximal ideal
of $k[X, Y]$ containing it. Then we have a chain of inclusions of prime ideals

$$0 \subsetneq (f(X, Y)) \subseteq I \subseteq J.$$

Theorem 7.22 shows that $k[X, Y]$ has Krull dimension 2, and it follows that either
$(f(X, Y)) = I$ in the above chain of inclusions, or $I = J$. The latter equality would
mean that $I$ is maximal and therefore that $R \cong k[X, Y]/I$ is a field; this is not the
case, and thus $(f(X, Y)) = I$. Hence $R \cong k[X, Y]/(f(X, Y))$. Here $f(X, Y)$ is an
affine plane curve irreducible over $k$, and the field of fractions of $R$ is by definition
the function field of the curve; this field is $F$, and the argument is complete.


4. The singular points are common zeros of $f$, $\frac{\partial f}{\partial X}$, and $\frac{\partial f}{\partial Y}$. If there are infinitely
many, then Bezout's Theorem says that $f$ and $\frac{\partial f}{\partial X}$ have a nontrivial common factor,
and so do $f$ and $\frac{\partial f}{\partial Y}$. Since $f$ is irreducible and the partial derivatives reduce degrees
in one or the other variable, we must have $\frac{\partial f}{\partial X} = \frac{\partial f}{\partial Y} = 0$ as polynomials. This is
impossible in characteristic 0. In characteristic $p$, the first condition says that the
only powers of $X$ that appear in $f$ are powers of $X^p$, and the second condition says
that the only powers of $Y$ that appear are powers of $Y^p$. The coefficients of $f$ are
$p^{\text{th}}$ powers because $k$ is assumed perfect, and thus $f$ is exhibited as a $p^{\text{th}}$ power, in
contradiction to its assumed irreducibility.

5. Differentiate $f(X, b) = (X - a)f_1(X)$ and evaluate at $(a, b)$ to obtain
$\frac{\partial f}{\partial X}(a, b) = f_1(a) + (a - a)f_1'(a) = f_1(a)$.

6. Multiply the equation $g(X, b) = (X - a)g_1(X)$ by $f_1(X)$ and substitute to obtain
$g(X, b)f_1(X) = f(X, b)g_1(X)$. Then the function $g(X, \cdot)f_1(X) - f(X, \cdot)g_1(X)$
is 0 at $b$ and is of the form $g(X, Y)f_1(X) - f(X, Y)g_1(X) = (Y - b)h_1(X, Y)$,
where $h_1(X, Y)$ for each $X$ is a polynomial in $Y$. Since $(Y - b)h_1(X, Y)$ is equal to a
polynomial in $(X, Y)$, $h_1(X, Y)$ is a polynomial in $(X, Y)$. To complete the problem,
evaluate both sides at $(x, y)$, and use the facts that $f(x, y) = 0$ and that $f_1(x) \ne 0$.

7. Since $F = k(x, y)$ is a function field in one variable, it is enough to see that
$y$ is transcendental over $k$. Arguing by contradiction, suppose that there is some
nonzero polynomial $c(Y)$ in $k[Y]$ having $y$ as a root. As a polynomial in $k[X, Y]$,
$c(Y)$ maps to $c(y) = 0$ when we pass to the quotient in $k[X, Y]/(f(X, Y))$, and
therefore $c(Y)$ is the product of $f(X, Y)$ by a polynomial. On the other hand, $\frac{\partial f}{\partial X}$ is
not 0, and thus $f(X, Y)$ depends nontrivially on $X$. Hence the product of $f(X, Y)$
and any nonzero polynomial in $(X, Y)$ depends nontrivially on $X$, contradiction. The
result now follows from the observation at the end of Section 1.


8. Substituting $a$ for $x$ in the formula for $g(x, y)$ gives

$$g(a, y) = (y - b)^k h_k(a, y)/f_1(a)^k.$$

In this formula, $h_k(a, y)$ is a polynomial expression in $y$, hence also in $y - b$. Thus
$v_1$ is $\ge 0$ on it. The expression $f_1(a)^k$ is a nonzero member of $k$, on which $v_1$ takes
the value 0. Therefore

$$v_1(g(a, y)) = k\,v_1(y - b) + v_1(h_k(a, y)) \ge k\,v_1(y - b).$$

The left side is independent of $k$, and the right side is unbounded in $k$. Therefore
there is some upper bound to the values of $k$ for which $g(x, y)$ has an expansion of
the kind in question.


9. For (a), we cannot have $h_k(a, b) = 0$ in Problem 8 for arbitrarily large $k$ because
of the bound found in Problem 8. If $k = n$ is the smallest $k$ for which $h_k(a, b) \ne 0$,
then the displayed formula holds with $h = h_n$. For uniqueness we substitute $a$ for $x$
and see that $g(a, y) = p_n(y)(y - b)^n$ for a polynomial $p_n$ with $p_n(b) \ne 0$. We cannot
have two such expressions involving distinct powers $n$ because $y$ is transcendental
over $k$.

For (b), we see from (a) that every nonzero member of $R$ is of the required form
with $n \ge 0$. Since $F$ is the field of fractions of $R$, the same thing is true for $F$ as long
as we allow $n$ to be arbitrary in $\mathbb{Z}$.

For (c), if we have two such expressions, we set them equal, clear fractions, and
write the result as $(y - b)^k p(x, y) = q(x, y)$ for some $k \ge 0$ and for some polynomials
$p$ and $q$ with $p(a, b) \ne 0$ and $q(a, b) \ne 0$. Substituting $(a, b)$ for $(x, y)$, we obtain 0
from $(y - b)^k p(x, y)$ unless $k = 0$, and we obtain something nonzero from $q(x, y)$.
Therefore $k = 0$, and the required uniqueness follows.


10. From the definition we immediately have $v(g) = +\infty$ if and only if $g = 0$,
as well as $v(gg') = v(g) + v(g')$ for all $g$ and $g'$. We are to show that $v(g + g') \ge
\min(v(g), v(g'))$. Thus write $g(x, y) = (y - b)^n h_1(x, y)/h_2(x, y)$ and $g'(x, y) =
(y - b)^m h_1'(x, y)/h_2'(x, y)$ with $n \le m$. Then $\min(v(g), v(g')) = \min(n, m) = n$.
Also,

$$g + g' = (y - b)^n\,\frac{h_1 h_2' + (y - b)^{m-n} h_1' h_2}{h_2 h_2'}.$$

The numerator of the displayed fraction is a polynomial and can be written in the
form of Problem 9a. Say that $(y - b)^k$ is the power of $(y - b)$ that appears in it,
$k$ being $\ge 0$. Then $v(g + g') = n + k$, and this is $\ge n = \min(v(g), v(g'))$. The
assertions about the valuation ring and the valuation ideal are clear.

11. Let $v'$ be a second valuation having the stated properties. If $g(x, y)$ is given
in $F^\times$, decompose $g$ as in Problem 9b, and apply $v'$. Then we obtain $v'(g(x, y)) =
n\,v'(y - b) + v'(h_1(x, y)) - v'(h_2(x, y))$. The assumptions on $v'$ show that
$v'(h_1(x, y)) = v'(h_2(x, y)) = 0$. Therefore

$$v'(g(x, y)) = n\,v'(y - b) = v'(y - b)\,v(g(x, y)),$$

and $v' = v'(y - b)\,v$. By assumption, $v'(y - b)$ is positive. Since $v'$ has to be onto
$\mathbb{Z} \cup \{\infty\}$, we must have $v'(y - b) = 1$.

12. For (a), the argument is the same as with Problem 7 except that the roles of
$x$ and $y$ are reversed. The partial derivative $\frac{\partial}{\partial Y}(Y^2 - f(X)) = 2Y$ is not the 0 element
because the characteristic is not 2, and hence that earlier argument applies. Part (b)
is elementary field theory, and (d) is a routine verification.

For (c), let $k'$ be the subfield of elements of $F$ algebraic over $k$. Problem 1 shows
that $[k' : k] \le [k'(x) : k(x)] \le [F : k(x)] = 2$. Arguing by contradiction, suppose
that $\{1, t\}$ is a basis of $k'$ over $k$. Let $X^2 + uX + v$ be the minimal polynomial of
$t$ over $k$; $t$ satisfies $t^2 + ut + v = 0$. Problem 1a shows that $t = a(x) + yb(x)$
with $b(x) \ne 0$, and then $t$ satisfies $t^2 - 2a(x)t + (a(x)^2 - f(x)b(x)^2) = 0$. Hence
$ut + v = -2a(x)t + (a(x)^2 - f(x)b(x)^2)$. If $u \ne -2a(x)$, then we can solve
for $t$ and obtain the contradiction that $t$ is in $k(x)$. Thus $u = -2a(x)$, and also
$v = a(x)^2 - f(x)b(x)^2$. Since $x$ is transcendental over $k$, the first of these shows that
$a(x)$ does not involve $x$, i.e., $a(x)$ lies in $k$. Then the second shows that $f(x)b(x)^2$
lies in $k$, and unique factorization leads to the conclusion that $f(x)$ and $b(x)$ do not
depend on $x$. This contradicts the assumption that $f(X)$ is nonconstant.


13. Let $z = a(x) + yb(x)$ be in the integral closure. Then so is the image of $z$
under the nontrivial Galois group element $\sigma$, and so are $z + \sigma(z)$ and $z\sigma(z)$. The
latter elements are $2a(x)$ and $a(x)^2 - f(x)b(x)^2$. Thus $a(x)$ is in the intersection of
the integral closure with $k(x)$, which is $k[x]$ because $k[x]$ is a principal ideal domain
and is integrally closed. Then $f(x)b(x)^2$ is in $k[x]$ by the same argument. Since
$f(x)$ is square free, it follows that $b(x)$ is in $k[x]$.


14. Part (a) is immediate from Corollary 6.6. Discrete valuations of $F$ that are not
in $\mathbb{D}_F$ play no role because of the inclusion $k \subseteq R$: any discrete valuation that is $\ge 0$
on $R$ has to be 0 on $k^\times$, since the image of $k^\times$ under the valuation is a subgroup of $\mathbb{Z}$.

For (b), the condition for $z \ne 0$ to be in $L(p(x)_\infty)$ is that $v(z) \ge -p\,\mathrm{ord}_v (x)_\infty$ for all
$v \in \mathbb{D}_F$. If a particular $v$ has $v(x) \ge 0$, then $v$ does not contribute to $(x)_\infty$, and this
condition says that $v(z) \ge 0$. By (a), $z$ is in $R$.


15. For (a), let $c(x) = c_n x^n + \cdots + c_0 = x^n(c_n + c_{n-1}x^{-1} + \cdots + c_0 x^{-n})$ with
$c_n \ne 0$. Then $v(c_n) = 0$, and $v(c_j x^{j-n}) > 0$ for $j < n$. Hence

$$v\big(x^n(c_n + c_{n-1}x^{-1} + \cdots + c_0 x^{-n})\big) = n\,v(x) + v(c_n + c_{n-1}x^{-1} + \cdots + c_0 x^{-n})
= n\,v(x) + v(c_n) = n\,v(x).$$

For (b), $2v(y) = v(y^2) = v(f(x)) = (\deg f)\,v(x)$, the latter equality holding by (a).
In (c), we have

$$\begin{aligned}
v(a(x) + yb(x)) &\ge \min\big(v(a(x)),\, v(yb(x))\big) \\
&= \min\big(v(a(x)),\, v(y) + v(b(x))\big) \\
&= \min\big((\deg a)\,v(x),\, (\tfrac12 \deg f + \deg b)\,v(x)\big) \\
&= v(x)\max\big(\deg a,\, \tfrac12 \deg f + \deg b\big) \ge p\,v(x).
\end{aligned}$$


16. Any $v \in \mathbb{D}_F$ with $v(x) \ge 0$ has $v(z) \ge 0 = -\mathrm{ord}_v (x)_\infty$ on all elements
$z = a(x) + yb(x)$ with $a(x)$ and $b(x)$ in $k[x]$, by Problems 13 and 14a. Suppose
that $v(x) < 0$. Then Problem 15c and the assumptions on the degrees of $a(x)$ and
$b(x)$ show that $v(z) \ge p\,v(x) = -p\,\mathrm{ord}_v (x)_\infty$. Hence $(z) \ge -p(x)_\infty$, and $z$ lies in
$L(p(x)_\infty)$.
18. For (a), let $\sigma$ be the nontrivial element of the Galois group. Problem 17c
shows that if $z = a(x) + yb(x)$ is in $L(p(x)_\infty)$, then so is $\sigma(z) = a(x) - yb(x)$.
Hence any $v \in \mathbb{D}_F$ with $v(x) < 0$ has $v(a(x) + yb(x)) \ge -p\,\mathrm{ord}_v (x)_\infty = p\,v(x)$
and $v(a(x) - yb(x)) \ge -p\,\mathrm{ord}_v (x)_\infty = p\,v(x)$. Consequently

$$v(a(x)) = v(2a(x)) \ge \min\big(v(a(x) + yb(x)),\, v(a(x) - yb(x))\big) \ge \min\big(p\,v(x),\, p\,v(x)\big) = p\,v(x)$$

and

$$v(a(x)^2 - f(x)b(x)^2) = v(a(x) + yb(x)) + v(a(x) - yb(x)) \ge p\,v(x) + p\,v(x).$$

Using Problem 15a and the fact that $v(x) < 0$, we see from these two inequalities
that $\deg a \le p$ and $\deg(a^2 - fb^2) \le 2p$.

For (b), Problem 14b shows that $L(p(x)_\infty) \subseteq R$, and Problem 13 shows that
$R$ consists of all $a(x) + yb(x)$ with $a(x)$ and $b(x)$ in $k[x]$. Part (a) thus shows that
$\deg a \le p$ and $\deg(a^2 - fb^2) \le 2p$. Since $\deg a \le p$, the second of these inequalities
shows that $\deg fb^2 \le 2p$. Thus $\deg b + \tfrac12 \deg f \le p$. In the reverse direction, if
$a(x)$ and $b(x)$ are polynomials satisfying the degree relations, then Problem 16 shows
that $a(x) + yb(x)$ is in $L(p(x)_\infty)$.

19. The polynomials $a(x)$ and $b(x)$ are limited only by the restrictions on their
degrees. From $\deg a \le p$, we get a space of dimension $p + 1$. From $\deg b + \tfrac12 \deg f \le
p$, we have $\deg b \le [p - \tfrac12 \deg f]$, and we get a space of dimension $[p - \tfrac12 \deg f] + 1$
if $[p - \tfrac12 \deg f] \ge 0$. Thus

$$\begin{aligned}
\ell(p(x)_\infty) &= (p + 1) + [p - \tfrac12 \deg f] + 1 \\
&= 2p + 2 + [-\tfrac12 \deg f] = 2p + 2 - [\tfrac12(1 + \deg f)]
\end{aligned}$$

if $p \ge -[-\tfrac12 \deg f] = +[\tfrac12(1 + \deg f)]$.

20. Part (a) is immediate from Theorem 9.3, since $[F : k(x)] = 2$. For (b), Theorem
9.9 and Problem 19, in combination with the result of (a), show for sufficiently
large positive $p$ that

$$1 - g = \ell(p(x)_\infty) - p \deg (x)_\infty = 2p + 2 - [\tfrac12(1 + \deg f)] - 2p.$$

Hence $g = [\tfrac12(1 + \deg f)] - 1$.
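
The arithmetic in Problems 19 and 20 can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, the dimension is computed by the direct count of Problem 19 (polynomials $a$ with $\deg a \le p$ and $b$ with $\deg b \le [p - \tfrac12\deg f]$), and "sufficiently large $p$" is replaced by an arbitrary choice of $p$ beyond the threshold stated in Problem 19.

```python
def ell(p: int, deg_f: int) -> int:
    """Dimension count of Problem 19: (p + 1) + ([p - deg_f/2] + 1)."""
    threshold = (1 + deg_f) // 2          # [ (1 + deg f)/2 ]
    if p < threshold:
        raise ValueError("formula is only claimed for p >= [(1 + deg f)/2]")
    floor_term = p + (-deg_f) // 2        # [p - deg_f/2]; // floors toward -infinity
    return (p + 1) + floor_term + 1


def genus(deg_f: int) -> int:
    """Solve 1 - g = ell(p (x)_oo) - p * deg (x)_oo with deg (x)_oo = 2."""
    p = (1 + deg_f) // 2 + 5              # an arbitrary p past the threshold
    return 1 - (ell(p, deg_f) - 2 * p)


if __name__ == "__main__":
    for deg_f in (1, 2, 3, 4, 5, 6):
        print(deg_f, genus(deg_f))
```

For $\deg f = 3$ this gives genus 1 and for $\deg f = 1$ it gives genus 0, which are the two values used in Problem 22.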

21. Let $\Phi : k(X)[Y] \to k(X)[Z]$ be the substitution homomorphism that fixes
$k(X)$ and has $\Phi(Y) = g(X)Z$, and follow it with the quotient homomorphism to
$k(X)[Z]/(Z^2 - h(X))$. Then

$$\Phi(Y^2 - f(X)) = g(X)^2 Z^2 - f(X) = g(X)^2(Z^2 - h(X)),$$

which goes to 0 in the quotient. Thus the composition of $\Phi$ followed by the quotient
map descends to a field map $\varphi : k(X)[Y]/(Y^2 - f(X)) \to k(X)[Z]/(Z^2 - h(X))$.
The inverse is constructed in the same way, starting from the formula $\Psi(Z) =
g(X)^{-1}Y$.
22. For (a), the conclusion genus 1 when there are no repeated roots is immediate
from Problem 20b with $\deg f = 3$. If there are repeated roots, then we can write
$f(X) = g(X)^2 h(X)$ with $\deg g = \deg h = 1$. Applying Problem 21, we see that the
genus is the same as for Problem 20b with $\deg f = 1$, i.e., the genus is 0.

For (b), a singularity occurs only at points $(x, y)$ of the zero locus in $k^2$, at which
both first partials are 0. Then $2Y = 0$, which says that $y = 0$ because the characteristic
is not 2, and $f'(X) = 0$, which says that $x$ is a root in $k_{\mathrm{alg}}$ of both $f(X)$ and $f'(X)$.
This means that $x$ is at least a double root in $k_{\mathrm{alg}}$ of $f(X)$.


23. The residue class degree $f_v$ is 1, since $k$ is algebraically closed. Thus $\deg nv =
n$. Corollary 9.4 gives $\ell(0v) = 1$, Corollaries 9.22 and 9.23 together give $\ell(1v) = 1$
if $g \ge 1$, and Corollary 9.19 gives $\ell((2g - 1)v) = \deg((2g - 1)v) + (1 - g) =
(2g - 1) + (1 - g) = g$ and $\ell(2gv) = \deg(2gv) + (1 - g) = g + 1$. The inequality
$\ell(nv) \le \ell((n + 1)v) \le \ell(nv) + 1$ follows by combining Theorem 9.6, the fact that
$A \le B$ implies $L(A) \subseteq L(B)$, and the fact that $f_v = 1$.


24. For each $n > 0$,

$$L(nv) = \{0\} \cup \{x \in F^\times \mid -(x)_\infty \ge -nv\} = \{0\} \cup \{x \in F^\times \mid (x)_\infty \le nv\}.$$

Thus $n \ge 1$ is a gap if and only if $\ell(nv) = \ell((n - 1)v)$, and otherwise $\ell(nv) =
\ell((n - 1)v) + 1$ by the last fact in Problem 23.

Suppose that there are $m$ gaps in passing from $\ell(0v)$ to $\ell(2gv)$. In the process we
take $2g$ steps from $(n - 1)v$ to $nv$, of which $m$ are gaps and $2g - m$ are nongaps. (The
gaps are certain of these integers $n$, $1 \le n \le 2g$.) Since $\ell(0v) = 1$ and $\ell(2gv) = g + 1$
by Problem 23, the total number of nongaps is $(g + 1) - 1 = g$. Solving $2g - m = g$
gives $m = g$. The formulas $\ell((2g - 1)v) = g$ and $\ell(2gv) = g + 1$ from Problem 23
show that $2g$ is not a gap.


25. For (a), if the gap sequence is $(1, 2, \dots, g)$, then $1 = \ell(0v) = \ell(1v) =
\ell(2v) = \cdots = \ell(gv)$. Conversely if the gap sequence is something else, let $n$ with
$1 \le n \le g$ be the first nongap; then $1 = \ell(0v) = \cdots = \ell((n - 1)v) < \ell(nv) \le \ell(gv)$.

For (b), Problem 23 gives $\ell(0v) = \ell(1v) = 1$ if $g \ge 1$, and thus 1 is a gap.

For (c), there are no integers strictly between 0 and $2g$ if $g = 0$, and the only such
integer for $g = 1$ is 1. Part (b) shows that the gap sequence is indeed $(1)$ if $g = 1$,
and thus the gap sequence is always the standard one.

For (d), we have some $x$ and $y$ in $F^\times$ with $(x)_\infty = rv$ and $(y)_\infty = sv$. Thus
$(x) = (x)_0 - rv$ and $(y) = (y)_0 - sv$, and $(xy) = (x)_0 + (y)_0 - (r + s)v$. Since $v$
does not contribute to $(x)_0$ and $(y)_0$, $(xy)_\infty = (r + s)v$, and thus $r + s$ is a nongap.

For (e), if 2 is a nongap, then iteration of (d) shows that $2, 4, 6, \dots, 2g - 2$ are
nongaps. The only possible gaps are the remaining integers from 1 to $2g - 1$, namely
$1, 3, 5, \dots, 2g - 1$. There are $g$ of these, and so all of them must be gaps.
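
Part (d) says that the nongaps are closed under addition, and part (e) uses this closure to pin down the gap sequence. The following sketch illustrates the bookkeeping under a loudly stated assumption: the nongaps up to $2g$ are taken to be exactly the sums of a given list of generators. That list is hypothetical input for illustration; which integers are actually nongaps depends on the curve and the point.

```python
def gaps(generators, g):
    """Integers in 1..2g that are NOT sums of the given generators.

    Assumption (not from the text): the nongaps up to 2g are exactly the
    nonnegative integer combinations of `generators`.
    """
    bound = 2 * g
    nongap = [False] * (bound + 1)
    nongap[0] = True                      # 0 is always a nongap (constants)
    for n in range(1, bound + 1):
        # n is a nongap if n - s is a nongap for some generator s (closure as in part (d))
        nongap[n] = any(n >= s and nongap[n - s] for s in generators)
    return [n for n in range(1, bound + 1) if not nongap[n]]


if __name__ == "__main__":
    g = 3
    print(gaps([2, 2 * g + 1], g))               # case of part (e): 2 is a nongap
    print(gaps(list(range(g + 1, 2 * g + 2)), g))  # the standard gap sequence
```

With generators 2 and $2g + 1$ the gaps come out as the odd integers $1, 3, \dots, 2g - 1$, matching part (e), and in both sample runs there are exactly $g$ gaps, matching Problem 24.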
1. If $F$ is in $I(P)$, expand $F$ as a sum of homogeneous terms $F = \sum_d F_d$.
Then $0 = F(tx_0, \dots, tx_n) = \sum_d F_d(tx_0, \dots, tx_n) = \sum_d t^d F_d(x_0, \dots, x_n)$ for
all $t \in k^\times$. Since $k$ is infinite, every coefficient of this polynomial in $t$ is 0. Thus
each $F_d$ is in $I(P)$, and $I(P)$ is generated by homogeneous elements.


2. In each part we argue by contradiction. For (a), if $\{X_\alpha\}$ is a system of nonempty
closed subsets of $X$ with the finite intersection property such that $\bigcap_\alpha X_\alpha = \varnothing$, then
we can inductively define a strictly decreasing sequence of finite intersections of the
$X_\alpha$'s, in contradiction to the Noetherian property. In (b), if $E$ is a closed irreducible
subset that is not connected, then $E = U \cup V$ with $U$ and $V$ nonempty, disjoint, and
relatively open. Then $E = U^c \cup V^c$ contradicts the irreducibility of $E$.


3. For (a), the continuous image of a connected set is connected. Continuity is
by Proposition 10.32, and connectedness is by Problem 2b applied to the Noetherian
topological space $V$. For (b), if $f$ is any polynomial function on $\mathbb{A}^n$, then $f \circ \varphi$ is
in $\mathcal{O}(V)$ because $\varphi$ is a morphism, and $f \circ \varphi$ is constant by Corollary 10.31. Then
$\varphi$ cannot have two distinct points in its image, since any two points in $\mathbb{A}^n$ can be
distinguished by some polynomial.


4. Certainly $\mathcal{O}(U) \supseteq k[X, Y]$. Also, the function field $k(U)$ consists of all
quotients of polynomials $a/b$ with $a$ and $b$ in $k[X, Y]$ and $b \ne 0$. Thus suppose that
$f = a/b$ lies in $\mathcal{O}(U)$. By unique factorization in $k[X, Y]$, we may assume that $a$
and $b$ are relatively prime. In the expression $f = a/b$, regularity at $P$ implies that
$b(P) \ne 0$ because an equality $a/b = c/d$ of two such expressions implies that $a = kc$
and $b = kd$ for some nonzero scalar $k$. Since $f$ is regular everywhere in $\mathbb{A}^2$ except
possibly at the origin, $b(X, Y)$ is nonvanishing away from the origin. However, if
$b$ is nonconstant, then $V(b)$ is a curve and has dimension 1, whereas the origin has
dimension 0. We conclude that $b$ is constant, and $f = a/b$ is in $k[X, Y]$.


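5. Arguing by contradiction, let $\varphi : W \to U$ be an isomorphism from an affine
variety onto $U$. Then the map $\tilde\varphi : \mathcal{O}(U) \to \mathcal{O}(W) = A(W)$ given by $\tilde\varphi(f) = f \circ \varphi$ is
an isomorphism. Let $\iota : U \to \mathbb{A}^2$ be the inclusion. The corresponding map on regular
functions is $\tilde\iota : A(\mathbb{A}^2) \to \mathcal{O}(U)$ given by $\tilde\iota(h)(x, y) = h(x, y)$ for $(x, y) \ne (0, 0)$,
and it is an isomorphism by Problem 4. Then $(\iota \circ \varphi)^\sim = \tilde\varphi \circ \tilde\iota$ is an isomorphism
of $A(\mathbb{A}^2)$ onto $A(W)$. Its inverse has to be of the form $\tilde\psi$ with $\tilde\psi(g) = g \circ \psi$ for
some isomorphism $\psi : \mathbb{A}^2 \to W$, according to Theorem 10.38. Since $\tilde\psi \circ \tilde\varphi \circ \tilde\iota$ is the
identity map on $A(\mathbb{A}^2)$, $\iota \circ \varphi \circ \psi$ is the identity map on $\mathbb{A}^2$. Using the definition of $\iota$
shows that $\varphi \circ \psi(x, y) = (x, y)$ for $(x, y) \ne (0, 0)$. Thus $\varphi \circ \psi$ is an isomorphism of
$\mathbb{A}^2$ onto $U$ that is the identity on $U$. This is a contradiction, since there is no possible
image for $(0, 0)$ under $\varphi \circ \psi$ that makes $\varphi \circ \psi$ one-one.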

6. Let $\varphi$ be the rational map of the irreducible curve $C$ into the irreducible curve
$C'$, and let $(E, \varphi_E)$ be a morphism in the class $\varphi$. If $\varphi$ is not dominant, then $\overline{\varphi_E(E)}$
is a proper closed subset of $C'$ and must be finite. Hence $\varphi_E(E)$ is finite. The set $E$
is connected by Problem 2b, and morphisms are continuous by definition. Therefore
$\varphi_E(E)$ is connected. Being connected and finite, it is a singleton set $\{y\}$. If $\varphi_C$ is
defined as everywhere equal to $y$ on $C$, then $(C, \varphi_C)$ is in the equivalence class $\varphi$. So
$\varphi$ is constant.


7. Suppose that $f$ is a member of $\mathcal{O}_{\varphi(P)}(V)$ with $\tilde\varphi(f) = 0$. Since the set on
which $f \in k(V)$ is regular is open, there exists an open neighborhood $E$ of $\varphi(P)$ on
which $f$ is defined. The morphism $\varphi$ is continuous, and thus $\varphi^{-1}(E)$ is open in $U$.
Since $\varphi$ is a morphism and $f$ is regular on $E$, $f \circ \varphi$ is regular on $\varphi^{-1}(E)$. According
to the proof of Proposition 10.42, $\tilde\varphi(f)$ is defined to be the unique member of $k(U)$
that agrees with $f \circ \varphi$ on $\varphi^{-1}(E)$. We are assuming $\tilde\varphi(f)$ to be 0, and thus $f \circ \varphi$
equals 0 on $\varphi^{-1}(E)$. By dominance of $\varphi$, $\varphi(\varphi^{-1}(E))$ is a dense subset of $E$. Thus
the continuous function $f$ is 0 on a dense subset of its domain $E$ and is 0.


8. The inclusion $(WX - YZ) \subseteq (X, Z)$ yields a homomorphism $\varphi$ of $A(V)$ onto
$k[W, X, Y, Z]/(X, Z) \cong k[W, Y]$. Let $b' = \varphi(b)$. Then $b'(w, y) = b(w, 0, y, 0)$
is a polynomial in $(w, y)$ nonzero in the complement of the origin. The solution
of Problem 4 shows that $b'(0, 0) \ne 0$. Thus $b(0, 0, 0, 0) \ne 0$, and $f$ is defined at
$(0, 0, 0, 0)$. In view of the discussion of this example in Section 4, $f$ is everywhere
defined. Therefore it is in $\mathcal{O}(V)$, which equals $A(V)$ because $V$ is an affine variety.
Thus there is a polynomial $g$ in $k[W, X, Y, Z]$ whose image $\bar g$ in $A(V)$ equals $X/Y$.
Then $Y\bar g = X$, and $Yg = X + (WX - YZ)h$ for some polynomial $h$. So $Y(g + hZ) =
X(1 + Wh)$. This implies that $Y$ divides $1 + Wh$, which we see is impossible by
evaluating at the origin.


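9. The equivalence of continuity of $\varphi$ and continuity of all $\varphi_\alpha$ will be taken as
known. Suppose that $\varphi : U \to V$ is a morphism. Let an index $\alpha$, an open set $E \subseteq V_\alpha$,
and a member $f$ of $\mathcal{O}(E)$ be given. We are to show that $f \circ \varphi_\alpha$ is in $\mathcal{O}(\varphi_\alpha^{-1}(E))$.
Since $\varphi$ is a morphism and $E$ is open in $V$, we know that $f \circ \varphi$ is in $\mathcal{O}(\varphi^{-1}(E))$. By
restriction, $f \circ \varphi_\alpha$ is in $\mathcal{O}(U_\alpha \cap \varphi^{-1}(E)) = \mathcal{O}(\varphi_\alpha^{-1}(E))$. Thus $\varphi_\alpha$ is a morphism.

In the reverse direction suppose that all $\varphi_\alpha : U_\alpha \to V_\alpha$ are morphisms. Let $E$
be open in $V$, and let $f$ be in $\mathcal{O}(E)$. We are to show that $f \circ \varphi$ is in $\mathcal{O}(\varphi^{-1}(E))$.
Since $\varphi^{-1}(E) = \bigcup_\alpha (U_\alpha \cap \varphi^{-1}(E))$, it is enough to prove regularity of $f \circ \varphi$ on each
$U_\alpha \cap \varphi^{-1}(E)$. On this open set, $f \circ \varphi$ equals $f \circ \varphi_\alpha$, which is regular because $\varphi_\alpha$ is
a morphism. Thus $\varphi$ is a morphism.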


10. For (a), we use the equivalence of regularity with the condition in Proposition
10.28. Thus regularity at $P$ in $U$ means that there is a subneighborhood $U_0$ of $U$
within $V$ about $P$ such that $f$ equals a quotient $\bar a/\bar b$ on $U_0$ with $\bar a$ and $\bar b$ in $A(V)$ and
with $\bar b$ nowhere vanishing on $U_0$. Choose polynomials $a$ and $b$ in $k[X_1, \dots, X_n]$ that
restrict to $\bar a$ and $\bar b$ on $V$. Let $\widetilde U_0'$ be an open subset of $\mathbb{A}^n$ whose intersection with $V$
is $U_0$. Since $b$ is nowhere 0 on $U_0$ and is continuous on $\widetilde U_0'$, the subset $\widetilde U_0$ of $\widetilde U_0'$ on
which $b$ is nonvanishing is open and contains $U_0$. Then Proposition 10.28 shows that
$F = a/b$ is a member of $\mathcal{O}(\widetilde U_0)$ whose restriction to $U_0$ equals $f$.

For (b), the result of (a) is local. Thus we can immediately allow $V$ to be quasi-affine.
Using Proposition 10.37, we can extend (a) to the case that $V$ is quasiprojective.
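11. Continuity is no problem. For the condition involving regularity, we use
Problem 10. Let $E$ be a relatively open set in $V$, and let $f$ be in $\mathcal{O}(E)$. We are to
show that $f \circ \varphi$ is in $\mathcal{O}(\varphi^{-1}(E))$. Thus let $P$ be in $\varphi^{-1}(E) \subseteq U$; then $\varphi(P)$ is in
$E \subseteq V$. Since $f$ is in $\mathcal{O}(E)$, Problem 10 produces a relatively open neighborhood $E_0$
of $\varphi(P)$, an open subset $\widetilde E_0$ of $Y$ with $\widetilde E_0 \cap V = E_0$, and a function $F$ in $\mathcal{O}(\widetilde E_0)$ such
that $F|_{E_0} = f|_{E_0}$. Since $\varphi : X \to Y$ is a morphism, $F \circ \varphi$ is in $\mathcal{O}(\varphi^{-1}(\widetilde E_0))$. Since
$\varphi(\varphi^{-1}(\widetilde E_0) \cap U) \subseteq \widetilde E_0 \cap V = E_0$, $F \circ \varphi$ agrees with $f \circ \varphi$ on $\varphi^{-1}(\widetilde E_0) \cap U$. Thus
$f \circ \varphi$ has an extension $F \circ \varphi$ from $\varphi^{-1}(\widetilde E_0) \cap U$ to $\varphi^{-1}(\widetilde E_0)$ that is in $\mathcal{O}(\varphi^{-1}(\widetilde E_0))$. The
quotients that exhibit $F \circ \varphi$ as defined at points of $\varphi^{-1}(\widetilde E_0) \cap U$ exhibit $f \circ \varphi$ as defined
there. The inclusion $\varphi^{-1}(E_0) = \varphi^{-1}(\widetilde E_0 \cap V) = \varphi^{-1}(\widetilde E_0) \cap \varphi^{-1}(V) \subseteq \varphi^{-1}(\widetilde E_0) \cap U$
shows that $f \circ \varphi$ is in $\mathcal{O}(\varphi^{-1}(E_0))$. This being true for all $P$ in $\varphi^{-1}(E)$, $f \circ \varphi$ is in
$\mathcal{O}(\varphi^{-1}(E))$.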

12. Part (a) follows by applying instances of Problem 11 to $\varphi$ and $\varphi^{-1}$. Then
(b) follows by another application of Problem 11. Part (c) follows by inductive
application of (b).

13. Let $d_i$ be the degree of homogeneity of $F_i$. Then the $i^{\text{th}}$ row of the right-hand
matrix is $\lambda^{d_i - 1}$ times the $i^{\text{th}}$ row of the left-hand matrix. Hence the dimension of the
span of the rows is the same for the two matrices, and this number is the rank.


14. This comes down to the fact that differentiating with respect to $X_j$ for $j > 0$ and
then setting $X_0$ equal to 1 is the same as setting $X_0$ equal to 1 and then differentiating
with respect to $X_j$.


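15. For any of the functions $F_i$, the right side of the formula in Euler's Theorem is 0
at $(x_0, \dots, x_n)$ by assumption. Hence Euler's Theorem gives

$$x_0 \frac{\partial F_i}{\partial X_0}(x_0, \dots, x_n) = -\sum_{j=1}^{n} x_j \frac{\partial F_i}{\partial X_j}(x_0, \dots, x_n).$$

This says that

$$x_0 \times 0^{\text{th}} \text{ column of } J(F)(x_0, \dots, x_n) = -\sum_{j=1}^{n} x_j \times j^{\text{th}} \text{ column of } J(F)(x_0, \dots, x_n).$$

Since $x_0 \ne 0$, this is a relation of the required type.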


16. Problem 13 shows that the left side equals $\operatorname{rank} J(F)(1, x_1/x_0, \dots, x_n/x_0)$,
which Problem 15 shows to be equal to the rank of the matrix formed from the last $n$
columns, which Problem 14 shows to be equal to the rank of $J(f)(x_1/x_0, \dots, x_n/x_0)$.


18. Regard the elements $w_{ij}$ as the entries of a matrix. The given condition is
that every 2-by-2 subdeterminant of this matrix equals 0. The matrix is not 0, and
consequently its rank is 1. Every matrix over $k$ of rank 1 is of the form $xy^t$ for column
vectors $x$ and $y$, and then $[\{w_{ij}\}]$ is exhibited as $\sigma([\{x_i\}], [\{y_j\}])$.
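
The rank-one factorization used in Problem 18 can be made explicit. The sketch below works over the rationals as a stand-in for the field $k$ (an assumption made only so that the example is runnable), and the function names are hypothetical. It checks that all 2-by-2 subdeterminants vanish and then reads off $x$ and $y$ from a nonzero entry.

```python
from fractions import Fraction


def all_minors_vanish(w):
    """True if every 2-by-2 subdeterminant of the matrix w is 0."""
    rows, cols = len(w), len(w[0])
    return all(
        w[i][j] * w[k][l] == w[i][l] * w[k][j]
        for i in range(rows) for k in range(i + 1, rows)
        for j in range(cols) for l in range(j + 1, cols)
    )


def factor_rank_one(w):
    """Return column vectors (x, y) with w[i][j] == x[i] * y[j]."""
    if not all_minors_vanish(w):
        raise ValueError("some 2-by-2 subdeterminant is nonzero")
    for i0, row in enumerate(w):
        for j0, pivot in enumerate(row):
            if pivot != 0:
                # y is the pivot row; x is the pivot column scaled so that x[i0] == 1.
                y = [Fraction(entry) for entry in row]
                x = [Fraction(w[i][j0]) / Fraction(pivot) for i in range(len(w))]
                return x, y
    raise ValueError("the zero matrix has no such factorization with rank 1")


if __name__ == "__main__":
    w = [[2, 4, 6], [1, 2, 3]]
    x, y = factor_rank_one(w)
    print(x, y)
```

The identity $w_{ij}\,w_{i_0 j_0} = w_{i j_0}\,w_{i_0 j}$, which is one of the vanishing subdeterminants, is exactly what makes $x_i y_j$ equal to $w_{ij}$.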


19. For (a), one suitable monomial ordering is the lexicographic ordering that
takes the elements $W_{ij}$ in the order $W_{00}, W_{01}, \dots, W_{mn}$ with $W_{00}$ largest. Given a
monomial $M'$ of total degree $d$, choose among all monomials of total degree $d$ the
smallest one in the ordering that is congruent to $M'$ modulo $\mathfrak{a}$. Write $M = \prod_{i,j} W_{ij}^{a_{ij}}$.
If $a_{ij} > 0$ and if there exists $(k, l)$ with $l > j$, $k > i$, and $a_{kl} > 0$, then $W_{ij}W_{kl}$ divides
$M$. Write $M_0 = M/(W_{ij}W_{kl})$. Put $M'' = M_0 W_{il}W_{kj}$. Since $W_{ij}W_{kl} - W_{il}W_{kj}$ is in
$\mathfrak{a}$, $M''$ is congruent to $M$ modulo $\mathfrak{a}$. In the monomial ordering, all of the elements
$W_{kl}, W_{il}, W_{kj}$ are smaller than $W_{ij}$. Therefore $M'' < M$, in contradiction to the
minimality of $M$.

In (b), let the largest $W_{ij}$ whose exponents in $M$ and $M'$ are unequal be $W_{i_0 j_0}$. Let
the products of the powers of the strictly larger monomials be $N$ and $N'$, respectively.
It is enough to prove that $\varphi(M/N) \ne \varphi(M'/N')$. Then we have

$$M/N = \prod_{W_{ij} \le W_{i_0 j_0}} W_{ij}^{a_{ij}} = W_{i_0 j_0}^{a_{i_0 j_0}} \prod_{\substack{(i,j) \text{ with } i_0 < i \text{ or} \\ (i_0 = i \text{ and } j_0 < j)}} W_{ij}^{a_{ij}}$$

and a similar expression for $M'/N'$. The minimality condition says that $a_{ij} = 0$ if
$i_0 < i$ and $j_0 < j$. Thus


M/N=( TI Wy')( TL W’) = (Mees Mec We") (Then Waal’) 


io<i, jo=j io=i, jo<j 


aj aj 
and CON) = (Tes, Des Xo Ca ae 


On the right side each pair of indices $(k, l)$ occurs at most once. Thus an equality
$\varphi(M/N) = \varphi(M'/N')$ would imply that $a_{kl} = b_{kl}$ for every $(k, l)$. This proves (b).

In (c), we know that $\mathfrak{a} \subseteq \ker \varphi$. If equality fails, then there is a linear combination
$\sum_r c_r M_r$ of monomials in $\ker \varphi$ that is not in $\mathfrak{a}$. Applying (a), we may assume that
each $M_r$ is reduced. Then $\sum_r c_r \varphi(M_r) = 0$. Each $\varphi(M_r)$ is a monomial, and (b)
shows that the various monomials $\varphi(M_r)$ are distinct. Since the set of monomials is
linearly independent, each $c_r$ is 0. Therefore $\sum_r c_r M_r = 0$, contradiction.


20. For (a), compute the kernel of the natural substitution homomorphism of
$k[X_0, \dots, X_m, Y_0, \dots, Y_n]$ into $R[Y_0, \dots, Y_n]$. For (b), let $P = [y_0, y_1, \dots, y_n]$,
$p = I(U) \subseteq k[X_0, \dots, X_m]$, and $q = I(\{P\}) \subseteq k[Y_0, \dots, Y_n]$. The inside homomorphism
has kernel $\mathfrak{a}$ by Problem 19. The outside homomorphism takes $X_0, \dots, X_m$
into $R$ and takes each $Y_j$ to $y_j Z$, where $Z$ is an indeterminate; its kernel is isomorphic
to pq. The kernel of the composition is $I(\sigma(U \times \{P\}))$, which is prime because $R[Z]$
is an integral domain.

21. See Fulton’s book, page 145. 

22. See Fulton’s book, page 146. 

23. For (a), Proposition 10.9 shows that $I(V(I)) = (h(X, Y))$ for an irreducible
polynomial $h$ if $\dim V(I) = 1$. The containment $I \subseteq I(V(I))$ shows that each $f_j$
has to be of the form $f_j = a_j h$ for some $a_j$ in $k[X, Y]$. Since $f_j$ and $h$ are irreducible,
$a_j$ has to be a scalar. Thus $I = (h(X, Y))$, and $I$ is prime. For (b), one can take
$I = (Y + X^2, Y - X^2)$, which has $V(I) = \{(0, 0)\}$ and which is not prime because
it contains $X^2$ but not $X$.
24. Let $\{g_1, \dots, g_s\}$ be a minimal Gröbner basis, and suppose that $g_j = ab$ is a
nontrivial factorization of $g_j$ in $k[X_1, \dots, X_n]$. Since $I$ is prime, we may assume that $a$
lies in $I$. Then $\mathrm{LM}(g_j) = \mathrm{LM}(a)\,\mathrm{LM}(b)$, and $\mathrm{LM}(a)$ lies in $\mathrm{LT}(I)$. Since $\{g_1, \dots, g_s\}$ is
a Gröbner basis, $\mathrm{LM}(a)$ lies in the monomial ideal $(\mathrm{LM}(g_1), \dots, \mathrm{LM}(g_s))$. By Lemma
8.17, $\mathrm{LM}(g_i)$ divides $\mathrm{LM}(a)$ for some $i$. It follows that $\mathrm{LM}(g_i)$ divides $\mathrm{LM}(g_j)$. Since
the Gröbner basis is minimal, $i = j$. That is, $\mathrm{LM}(g_j) = \mathrm{LM}(a) = \mathrm{LM}(g_i)$. Thus
$\mathrm{LM}(b) = 1$, in contradiction to the assumption that the factorization of $g_j$ is nontrivial.


25. Identify $a_{11}X^2 + 2a_{12}XY + a_{22}Y^2 + 2a_{13}XZ + 2a_{23}YZ + a_{33}Z^2$ with the
symmetric matrix

$$A = \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{12} & a_{22} & a_{23} \\ a_{13} & a_{23} & a_{33} \end{pmatrix}.$$

By the Principal Axis Theorem choose an invertible matrix $M$ such that $A' = M^t A M$
is diagonal. Put $\begin{pmatrix} X' \\ Y' \\ Z' \end{pmatrix} = M^{-1}\begin{pmatrix} X \\ Y \\ Z \end{pmatrix}$ and substitute. Then the given quadratic
polynomial equals $\alpha X'^2 + \beta Y'^2 + \gamma Z'^2$, where $\alpha, \beta, \gamma$ are the diagonal entries of
$A'$. If $\alpha\beta\gamma = 0$, this is reducible; it is readily checked to be irreducible if $\alpha\beta\gamma \ne 0$.
Since $\alpha\beta\gamma = \det A' = (\det M)^2 \det A$, the reducible polynomials correspond to the
affine hypersurface on which $\det A = 0$.
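
The determinant criterion of Problem 25 is easy to try on examples. The sketch below is illustrative only: the function names are hypothetical, exact rational arithmetic stands in for the coefficients, and the criterion is applied as stated in the hint, so it presupposes the setting of the problem (in particular that the Principal Axis Theorem applies and that reducibility is meant over the algebraically closed field). Note that $a_{12}$, $a_{13}$, $a_{23}$ are half the coefficients of $XY$, $XZ$, $YZ$.

```python
from fractions import Fraction


def conic_matrix(a11, a12, a22, a13, a23, a33):
    """Symmetric matrix of a11 X^2 + 2 a12 XY + a22 Y^2 + 2 a13 XZ + 2 a23 YZ + a33 Z^2."""
    return [[a11, a12, a13],
            [a12, a22, a23],
            [a13, a23, a33]]


def det3(m):
    """Determinant of a 3-by-3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def is_reducible(a11, a12, a22, a13, a23, a33):
    """Criterion of the hint: the quadratic form is reducible exactly when det A = 0."""
    return det3(conic_matrix(a11, a12, a22, a13, a23, a33)) == 0


if __name__ == "__main__":
    half = Fraction(1, 2)
    print(is_reducible(1, 0, -1, 0, 0, 0))     # X^2 - Y^2 = (X - Y)(X + Y)
    print(is_reducible(0, 0, -1, half, 0, 0))  # XZ - Y^2
```

In the first sample the form factors visibly and the determinant is 0; in the second the determinant is $\tfrac14$, so the criterion reports the form as irreducible.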


26. The first conclusion is a special case of Corollary 9.19. Then take $x$ to be
a nonconstant member of $L(2v_O)$, and take $y$ to be a member of $L(3v_O)$ not in the
linear span of $\{1, x\}$. Corollary 9.22 shows that $(x)_\infty = 2v_O$, and then the equality
$(y)_\infty = 3v_O$ follows from the definitions.

27. These are special cases of Theorem 9.3. 

28. Since $2 = [k(E) : k(x)] = [k(E) : k(x, y)]\,[k(x, y) : k(x)]$, the integer
$[k(E) : k(x, y)]$ divides 2. The corresponding equality with 3 and $k(y)$ shows that
$[k(E) : k(x, y)]$ divides 3. Therefore $[k(E) : k(x, y)] = 1$.

29. The values of $v_O$ on the seven listed members of $k(E)$ are $0, -2, -3, -4, -5, -6, -6$,
respectively. The members are all in $L(6v_O)$, which has dimension 6 by Problem 28,
and thus the listed members are linearly dependent. If $y^2$ or $x^3$ does not contribute
to this dependence, then $v_O$ takes distinct values on the remaining six members of
$L(6v_O)$, and Problem 19a at the end of Chapter VI gives a contradiction. Hence the
coefficients $b$ and $c$ of $y^2$ and $x^3$, respectively, are nonzero. If $x$ and $y$ are replaced by
$-bcx$ and $bc^2y$ and if the linear combination of terms is then divided by $b^3c^4$, then
the linear dependence takes the form $(y^2 + a_1xy + a_3y) - (x^3 + a_2x^2 + a_4x + a_6) = 0$,
as required. Hence $\varphi$ carries $E - \{O\}$ into $C \cap \mathbb{A}^2$.

30. Certainly $f(X, Y)$ is not divisible by any nonconstant polynomial in $X$. Thus
the only possible reducibility is of the form $f(X, Y) = (Y + p(X))(Y + q(X))$.
Expanding out the right side shows that

$$\begin{aligned}
p(X) + q(X) &= a_1X + a_3, \\
p(X)q(X) &= -(X^3 + a_2X^2 + a_4X + a_6).
\end{aligned}$$

The second equation shows that at least one of $p(X)$ and $q(X)$ has degree $> 1$, and
then the first equation shows that $\deg p(X) = \deg q(X)$. But this equality would
mean that $\deg p(X)q(X)$ is even, contradiction. Hence $f(X, Y)$ is irreducible.

31. The function $\varphi_0$ is a morphism of $E - \{O\}$ into $C \cap \mathbb{A}^2$ by Lemma 10.39, and
the composition with $\beta_0$ is a morphism into $\mathbb{P}^2$. Then $\varphi$ is a morphism of $E - \{O\}$
into $C$ by Problem 11. The class of $(E - \{O\}, \varphi)$ is therefore a rational map of $E$
into $C$, and Corollary 10.54 shows that $\varphi$ extends to a morphism $\Phi : E \to C$.

32. Let $\widetilde\Phi : k(C) \to k(E)$ be the field mapping that corresponds to $\Phi$ under
Theorem 10.45. The field $k(C)$ is generated by the functions $x_0$ and $y_0$ that pick
out the coordinates of points of $C \cap \mathbb{A}^2$, and Theorem 10.45 shows that $\widetilde\Phi(x_0) =
(\text{class of } x_0 \circ \varphi)$. For $P$ in $E - \{O\}$, this has $\widetilde\Phi(x_0)(P) = x_0(\varphi(P)) = x(P)$, i.e.,
$\widetilde\Phi(x_0) = x$. Similarly $\widetilde\Phi(y_0) = y$. Therefore $\widetilde\Phi(k(C)) = k(x, y)$. By Problem 28,
$\widetilde\Phi$ is onto $k(E)$. By Corollary 10.46, $\Phi$ is birational.

33. The homogeneous polynomial of degree 3 from which $f(X, Y)$ arises is

$$F(X, Y, W) = (Y^2W + a_1XYW + a_3YW^2) - (X^3 + a_2X^2W + a_4XW^2 + a_6W^3).$$

The points of $C$ on the line at infinity arise by setting $W = 0$ and $F(X, Y, W) =
0$ simultaneously, and the only such point is $[0, 1, 0]$. Computation shows that
$\frac{\partial F}{\partial W}(0, 1, 0) = 1$. Consequently $[0, 1, 0]$ is a nonsingular point of $C$.


34. A point $(x_0, y_0)$ in $\mathbb{A}^2$ is a singular point of $C$ if and only if $f(x_0, y_0) =
\frac{\partial f}{\partial X}(x_0, y_0) = \frac{\partial f}{\partial Y}(x_0, y_0) = 0$. At $(x_0, y_0)$, computation shows that

$$\frac{\partial^2 f}{\partial X^2} = -6x_0 - 2a_2, \qquad \frac{\partial^2 f}{\partial X\,\partial Y} = a_1, \qquad \frac{\partial^2 f}{\partial Y^2} = 2, \qquad \frac{\partial^3 f}{\partial X^3} = -6.$$

All other higher-order derivatives are 0. Application of Taylor’s formula about $(x_0, y_0)$
therefore gives

$$f(X, Y) = -(3x_0 + a_2)(X - x_0)^2 + a_1(X - x_0)(Y - y_0) + (Y - y_0)^2 - (X - x_0)^3.$$

We put $X = x$ and $Y = y$, taking into account that $f(x, y) = 0$. After division by
$(x - x_0)^2$, the result is that

$$(y - y_0)^2(x - x_0)^{-2} + a_1(y - y_0)(x - x_0)^{-1} = (3x_0 + a_2) + (x - x_0).$$

That is, $z^2 + a_1z = (3x_0 + a_2) + (x - x_0)$, where $z = (y - y_0)(x - x_0)^{-1}$. Suppose that
$P$ is in $E - \{O\}$ and that $v_P(z) < 0$. Then we have $v_P(z + a_1) < 0$ and

$$0 \le v_P((3x_0 + a_2) + (x - x_0)) = v_P(z^2 + a_1z) = v_P(z) + v_P(z + a_1) < 0,$$

contradiction. Therefore $v_P(z) \ge 0$. Meanwhile, $v_O(x - x_0) = v_O(x) = -2$ and
$v_O(y - y_0) = v_O(y) = -3$. Hence $v_O(z) = (-3) - (-2) = -1$.
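
The Taylor expansion used above can be checked on integer samples. The sketch below is an illustration only: it compares $f(X, Y)$ with the full expansion about an arbitrary point $(x_0, y_0)$, namely $f(x_0, y_0) + f_X\,(X - x_0) + f_Y\,(Y - y_0)$ plus the terms $-(3x_0 + a_2)(X - x_0)^2 + a_1(X - x_0)(Y - y_0) + (Y - y_0)^2 - (X - x_0)^3$. At a singular point the first three terms vanish, leaving the formula in the hint. The function names, the random seed, and the sampling range are assumptions.

```python
import random


def f(X, Y, a):
    a1, a2, a3, a4, a6 = a
    return (Y**2 + a1 * X * Y + a3 * Y) - (X**3 + a2 * X**2 + a4 * X + a6)


def f_X(X, Y, a):
    """Partial derivative of f with respect to X."""
    a1, a2, a3, a4, a6 = a
    return a1 * Y - 3 * X**2 - 2 * a2 * X - a4


def f_Y(X, Y, a):
    """Partial derivative of f with respect to Y."""
    a1, a2, a3, a4, a6 = a
    return 2 * Y + a1 * X + a3


def taylor(X, Y, x0, y0, a):
    """Full Taylor expansion of f about (x0, y0)."""
    a1, a2, a3, a4, a6 = a
    dx, dy = X - x0, Y - y0
    low = f(x0, y0, a) + f_X(x0, y0, a) * dx + f_Y(x0, y0, a) * dy
    high = -(3 * x0 + a2) * dx**2 + a1 * dx * dy + dy**2 - dx**3
    return low + high


rng = random.Random(0)   # fixed seed so the run is repeatable
mismatches = 0
for _ in range(1000):
    a = tuple(rng.randint(-9, 9) for _ in range(5))
    X, Y, x0, y0 = (rng.randint(-9, 9) for _ in range(4))
    if f(X, Y, a) != taylor(X, Y, x0, y0, a):
        mismatches += 1

print(mismatches)
```

Because the comparison uses exact integer arithmetic, a count of zero mismatches over the samples is consistent with the two expressions being the same polynomial.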

35. Corollary 9.22 shows that no member of $k(E)$ has the properties of $z$ found
in Problem 34. Thus $C$ is nonsingular at every $(x_0, y_0)$. In combination with
Problem 33, this shows that $C$ is everywhere nonsingular. By Corollary 10.55, $\Phi$ is an
isomorphism.

INDEX OF NOTATION


This list indexes recurring symbols introduced in Chapters I through X (pages 
1-648). For recurring symbols introduced in Basic Algebra, see the list of 
Notation and Terminology on pages xxi—xxiv. Some of the latter notation has 
been repeated here for the reader’s convenience. 

In the list below, each piece of notation is regarded as having a key symbol. 
The first group consists of those items for which the key symbol is a fixed Latin 
letter, and the items are arranged roughly alphabetically by that key symbol. The 
next group consists of those items for which the key symbol is a Greek letter. The 
final group consists of those items for which the key symbol is a variable or a 
nonletter, and these are arranged by type. To locate an item below, first proceed 
on the assumption that the key symbol is a Latin or Greek letter; if the item does 
not appear to be in the list, then treat it as if its key symbol is a variable or a nonletter.


A, Ax, 389, 559
A”, Aj, 455, 559
A(K, Gal(K/F), a), 137
Ar, 542, 543
Ax, 542, 543
A(V), 579
A, 570
Ag, 570
A(V), 584
A(V)a, 585
Kaig, 434
“0, 639
B(F), 126
B(K/F), 127
C, 330
C(a), 620
Cr, 169
C(V(a)), 633
Cr, 532, 549
Cro, 534
E*°, complement, xxi
coimage f, 240
coker f, 175
D(ξ), 279
D(K/F), 372
Dp, 532, 549
Dro, 534
Dx, 267
D(), 267
Diff(F), 547
Div(w), 548
d_{-1}, 194
d_n, 153
X = {(Xn, dr} _ 4, 174
dim R, 424
Extp(A, B), 223
ei, fi, g, 275, 354
(Cjis cae pCR) 619
extp(A, B), 223
F,[[X]], 347
F,((X)), 347
Fy, 346
Fry, 437

fv, 533 

Gp, 368 

Gal(F>, F)), 434 
≤_GLEX, ≤_GREVLEX, 494
g, 538

2x, 538

H(s, a), 633
Has, a), 621, 626
H(s,a), 633
H,(s,a), 625, 628
H;, 620

H,,(X), 153, 172
H"(X), 153, 174

H,(X), 172 
H*(X), 174 

A, (G, M), 209 
H"(G, M), 147 
Hom,(A, B), 169 
h(D), 7, 14 

hx, 299 

I, Ix, 390 

I', 390 

T, 330, 393, 576 
T, 576 

I= (r1,12), 38 
I= (71,11); 38 


I(E), 560, 571 
I(P, FG), 474 
I(P, LAF), 467 
image f, 240 
J(é), 272 

K(S), 409 
K(E), 412 

k, 528, 559 
k(V), 580, 585 
k’, 531 

L(A), 544 

L(A), 535 

L(s, x), 63 
LCM(X*, X*), 501
Log, 289 

LM(f), LC(f), LT(f), 496 
LT(1), 497 

≤_LEX, 493

£(A), 536 

lim, 439 

M, 493, 620 

Mp, 600 

M,, 431 

Mp, 600 

m,, 431 

mp(F), 474 

N(), 39, 273 
Najr(-), 165 
Nx/r(-), norm, xxiv 
Nrda/r(-), 165 
O(U), 580, 582, 587, 641 
Op(U), 582, 587 
Op(V), 580, 585 
R°, opposite ring, xxii 
ord,(A), 532 

P?, 456 

P", 457, 570 

Py, 457 

P, 330, 393 

Pr, 532, 549 

P,, 322, 533 

Q,, 316, 318 
R(f,g), 451 

R(f, g), 451 

R(fi, F), 514 

Ry, 346 

Ry, 322, 533 

R,, 431 

Residue, 542 
Residue,(y), 541 

r1, 12, 348, 383 

rad A, 79 

S(fi, f2), 502 


Soo, 391 

S~!R, localization, xxiv 
Spec A, 639 

(Spec A, O), 641 

Ksep, 434 

Tor® (A, B), 224 
TrasF(-), 165 
Trx/r(-), trace, xxiv 
Trrd4/r(-), 165 

A‘, transpose, xxi 
tor®(A, B), 224 
tr.deg R, 424 

V(C), Ve(C), 455-456 
Vi), 429 

V(S), 559, 571 
V(fi,---, Sk), 559 

Vr, 532 

uvp(-), 321 

Uoo, 328 

X(S), 388 

X”, 494, 620 

xj(P), 559 

Z(), 268 

Zp, 318 

Z, 437 

ZG, integral group ring, xxiii 


Greek 

ai, 369 

Bo, 574 

Bis 574 

B;, 369, 575 

6(A), 543 

6;;, Kronecker 6, xxi 
€, 149, 195 

Nis 369 

t, 390, 391 

o, 617, 646 
O1,---,0n, 283, 383 
Xo, 62 
ω, 185, 547


Functors given by subscripts 
and superscripts 

R*, units, xxiv 

Rp, localization, xxiv 

Xt, 194 

M®, invariants, 208 

Mg, coinvariants, 208 

M, dual fractional ideal, 372 
M,, 376 

L®, 460 


Specific functions 
α = (α_1, ..., α_n), multi-index, 494
|α|, 620
Gh, Legendre symbol, 8 


(“), Jacobi symbol, 68
[K : F], degree, xxiv 
Vel Press 05 

| · |, absolute value, 331
‖ · ‖, norm, 356

(X)o, A)oo, 532 


Isolated symbols 

~, Brauer equivalent, 124 

~, homotopic, 154 

dn, 153, 172 

d_1, 194 

IT. restricted direct product, 388 


Operations on sets and classes 
RG, group algebra, xxiii 

√I, radical, 405

K[X1,..-, Xnaila, 458 

A B, morphism, 235 


Miscellaneous 
(x), principal divisor, 532 
(xi )ier, 388 
I = (r_1, r_2), generated ideal, 38
l= (71,12), 38 

[x, y, w], point in P?, 459 
[xo,---,Xn], point in P”, 570 


gy = {(E, gz)}, rational map, 595 
X = {(Xn, On) O_ 4, 171 

(F,| - |~), valued field, 342 
{O(U), pyy}, presheaf, 640 


Abel, 521 
abelian category, 238 
abelian group 
divisible, 196 
torsion, 169 
abelian Lie algebra, 78 
absolute discriminant, 35, 267 
absolute norm of ideal, 39, 273 
absolute value, 289, 331 
archimedean, 289 
discrete, 338 
of idele, 390 
nontrivial, 332 
normalized, 383, 384, 385, 386 
trivial, 331 
acyclic resolution, 219 
additive category, 233 
additive functor, 170, 178 
adele, 389 
adjoint, 252 
affine algebraic set, 559 
dimension of, 566 
irreducible, 563 
affine coordinate ring, 579 
affine curve, irreducible, 529 
affine Hilbert function, 621, 626 
affine Hilbert polynomial, 625, 628 
affine hypersurface, irreducible, 430, 562 
affine local coordinates, 461 
affine n-space, 455, 559 
affine plane curve, 455 
irreducible, 430, 524, 562
affine plane line, 455 
affine scheme, 642 
affine variety, 429, 562 
algebra, xxiii 
abelian Lie, 78 
central, 111 
central simple, 111 
crossed-product, 137
cyclic, 122, 162, 163
generalized quaternion, 121
Lie, 77
polynomial, 164
semisimple, 80
semisimple Lie, 78
simple, 80
simple Lie, 79
solvable Lie, 78
Weyl, 85
algebraic closure, separable, 434
algebraic set
affine, 559
irreducible affine, 563
projective, 571
algebraically independent, 409 
aligned primitive forms, 25 
archimedean, 331, 333, 348 
archimedean absolute value, 289 
archimedean place, 383 
archimedean valuation, 289 
Artin product formula, 387, 390, 395 
Artin reciprocity, 265 
Artin’s Theorem, 89 
Artinian ring, 87 
associated prime ideal, 446 
associated translation, 622 
associated vector subspace, 622 
associative algebra
semisimple, 80
simple, 80
augmentation map, 149 


Baer, 168
base field, 327
base space, 640
Bayer–Stillman ordering, 494
Bezout, 449
Bezout’s Theorem, 453, 465, 471, 488
bidegree, 617
bifunctor, 223
bihomogeneous polynomial, 617
binary quadratic form, 3, 12
similar, 74
birational, 595
birational map, 595
birationally equivalent, 595
Blichfeldt, 293
boundary, 172
boundary map, 172
boundary operator, 172
bounded sequence, 317
bracket, 78
Brauer equivalent, 124
Brauer group, 126
relative, 127
Buchberger, 450
Buchberger’s algorithm, 506


canonical class, 551
canonical divisor, 551
Cartan, E., 79
Cartan, H., 168
category
abelian, 238
additive, 233
good, 169
Cauchy sequence, 317
Cayley, 77
central algebra, 111
central simple algebra, 111
centralizer, 114
chain complex, 171
in abelian category, 240
double, 257
tensor product for, 258
chain map, 154, 155, 173
character
Dirichlet, 62
genus, 74
multiplicative, 61
principal Dirichlet, 62
Chase, 141
Chevalley, 165, 168
Chinese Remainder Theorem, xxiii, 30, 69, 106, 314, 341, 367, 480, 483
class field, Hilbert, 265
class field theory, 265
class group
form, 28
ideal, 42, 265, 299, 330
idele, 393
class number, 299, 393
Dirichlet, 7, 14
co-invariant, 209
co-invariants functor, 209
coboundary, 174
coboundary map, 174
coboundary operator, 174
cochain complex, 173
cochain map, 154, 174
cocycle, 174
codomain of morphism, 232
cohomology, 153, 174
sheaf, 168, 171, 218, 643
coimage in abelian category, 239
cokernel, 175
of morphism, 236
universal mapping property, 236
common discriminant divisor, 272
common index divisor, 272, 287, 310, 371
commutator ideal, 78
complete presheaf, 641
complete valued field, 343
equal-characteristic case, 398
unequal-characteristic case, 398
completion, 342
universal mapping property of, 343
complex, 171
in abelian category, 240
chain, 171
cochain, 173
double, 257
flat, 259
complex place, 383
composition formula, 24
condition (C1), 165, 518
cone, 572, 633
conic, 458
conjugate, 266, 288, 383
connecting homomorphism, 185, 187
connecting morphism in abelian category, 248
convergent infinite product, 51
convergent sequence, 317
coordinate, 455, 559
affine local, 461
coordinate hyperplane, 620
coordinate ring
affine, 579
homogeneous, 584
coordinate subspace, 619 
coproduct, xxiii 
correspondence, one-one, xxi 
countable, xxi 
Cramer, 448 
Cramer’s paradox, 449 
Cramer’s rule, 448 
crossed-product algebra, 137 
cubic, 458
twisted, 562
cubic extension, pure, 280
cubic number field, 279
cubical singular chain, 172
cubical singular homology, 172
cup product, 256
curve
affine plane, 455
elliptic, 648
irreducible, 604
irreducible affine, 529
irreducible affine plane, 430, 524, 562
projective plane, 458
cycle, 172 
cyclic algebra, 122, 162, 163 
cyclotomic field, 309 


decomposition group, 368 
Dedekind, 77 
Dedekind Discriminant Theorem, 275, 371, 379, 381
Dedekind domain, xxiv, 266
extension of, xxiv, 327, 417
Dedekind example, 287, 310 
Dedekind’s Theorem on Differents, 376 
defined at a point, 580, 585 
degenerate, 172 
degree, 153
of divisor, 533
of inseparability, 415
residue class, 275, 354, 533
total, 457
transcendence, 413
derived functor, 204
formation of, 205
long exact sequence, 209, 214
Dickson, 122
different, 279
relative, 279, 372
differential, 543, 547
differential form, 541
dimension
of affine algebraic set, 566
of affine variety, 563
geometric, 565
Krull, 424, 528, 529, 564
of zero locus, 423
Diophantus, 1 
direct product, restricted, 388 
direct sum in additive category, 233 
directed set, 438 
Dirichlet, 2, 24, 77
Dirichlet box principle, 297 
Dirichlet character modulo m, 62 
Dirichlet class number, 7, 14 
Dirichlet L function, 63 
Dirichlet pigeonhole principle, 297 
Dirichlet series, 56 
Dirichlet Unit Theorem, 290, 292, 384, 390, 395
Dirichlet’s Theorem, 7, 50
discrete, 290
discrete absolute value, 338
discrete valuation, 322
defined over k, 529
discriminant, 12
absolute, 35, 267
of commutative semisimple algebra, 382
field, 35, 264, 267
fundamental, 33
of ordered basis, 267
relative, 275, 381
discriminant divisor, 272
divisible abelian group, 196
divisible module, 251
division algorithm, generalized, 499
divisor, 532
principal, 532
divisor class, 532, 549 
domain of morphism, 232 
dominant rational map, 595 
Double Centralizer Theorem, 115
double chain complex, 257 
dual of fractional ideal, 372 


Eckmann, 168 
Eilenberg, 168 
Eisenstein, 12 
Eisenstein polynomial, 402 
elimination ideal, 512 
Elimination Theorem, 512 
elimination type ordering, 494, 512 
elliptic curve, 648 
enough injectives, 202 
enough projectives, 202 
epi, 233 
epimorphism, 233 
equal-characteristic case, 398 
equivalence class of forms 
ordinary, 13 
proper, 13 
equivalence of 
absolute values, 333 
completions, 383 
forms, 13, 32 
improper, 13 
proper, 13, 32 
ideals, 40, 298 
narrow, 40 
strict, 40, 298 
morphisms, 242 
Euler, 1, 3, 9, 50
Euler product, 50, 54, 60 
first-degree, 60 
Euler’s Theorem, 516, 646 
exact complex, 175 
exact functor, 179 
left, 182 
right, 183 
exact on injectives, 222 
exact on projectives, 222 
exact sequence, 175 
in abelian category, 240 
long, 187, 188 
short, 175 
split, 200 